Abstract

We develop new methods to statically bound the resources needed for the execution
of systems of concurrent, interactive threads. Our study is concerned with
a synchronous model of interaction based on cooperative threads whose execution
proceeds in synchronous rounds called instants. Our contribution is a system of
compositional static analyses to guarantee that each instant terminates and to bound the
size of the values computed by the system as a function of the size of its parameters
at the beginning of the instant.

Our method generalises an approach designed for first-order functional languages 
that relies on a combination of standard termination techniques for term rewriting 
systems and an analysis of the size of the computed values based on the notion of 
quasi-interpretation. We show that these two methods can be combined to obtain an 
explicit polynomial bound on the resources needed for the execution of the system 
during an instant. 

As a second contribution, we introduce a virtual machine and a related bytecode, thus
producing a precise description of the resources needed for the execution of a system.
In this context, we present a suitable control flow analysis that allows us to formulate
the static analyses for resource control at the bytecode level.


1 Introduction 

The problem of bounding the usage made by programs of their resources has already
attracted considerable attention. Automatic extraction of resource bounds has mainly
focused on (first-order) functional languages, starting from Cobham’s characterisation [?]
of polynomial time functions by bounded recursion on notation. Subsequent work (see, e.g.,
[?]) has developed various inference techniques that allow for efficient analyses
while capturing a sufficiently large range of practical algorithms.

Previous work [?] has shown that polynomial time or space bounds can be obtained by
combining traditional termination techniques for term rewriting systems with an analysis
of the size of computed values based on the notion of quasi-interpretation. Thus, in a
nutshell, resource control relies on termination and bounds on data size.

This approach to resource control should be contrasted with traditional worst case
execution time technology (see, e.g., [?]): the bounds are less precise but they apply to a
larger class of algorithms and are functional in the size of the input, which seems more
appropriate in the context of the applications we have in mind (see below). In another
direction, one may compare the approach with the one based on linear logic (see, e.g., [7]):
while in principle the linear logic approach supports higher-order functions, it does not
yet offer a user-friendly programming language.

In [?], we have considered the problem of automatically inferring quasi-interpretations
in the space of multi-variate max-plus polynomials. In [1], we have presented a virtual
machine and a corresponding bytecode for a first-order functional language and shown 
how size and termination annotations can be formulated and verified at the level of the 
bytecode. In particular, we can derive from the verification an explicit polynomial bound 
on the space required to execute a given bytecode. 

In this work, we aim at extending and adapting these results to a concurrent framework. 
As a starting point, we choose a basic model of parallel threads interacting on shared 
variables. The kind of concurrency we consider is a cooperative one. This means that by 
default a running thread cannot be preempted unless it explicitly decides to return the
control to the scheduler. In preemptive threads, the opposite hypothesis is made: by default
a running thread can be preempted at any point unless it explicitly requires that a series
of actions is atomic. We refer to, e.g., [?] for an extended comparison of the cooperative
and preemptive models. Our viewpoint is pragmatic: the cooperative model is closer to 
the sequential one and many applications are easier to program in the cooperative model 
than in the preemptive one. Thus, as a first step, it makes sense to develop a resource 
control analysis for the cooperative model. 

The second major design choice is to assume that the computation is regulated by a notion 
of instant. An instant lasts as long as a thread can make some progress in the current 
instant. In other terms, an instant ends when the scheduler realizes that all threads are 
either stopped, or waiting for the next instant, or waiting for a value that no thread can 
produce in the current instant. Because of this notion of instant, we regard our model 
as synchronous. Because the model includes a logical notion of time, it is possible for a 
thread to react to the absence of an event. 

The reaction to the absence of an event is typical of synchronous languages such as
Esterel [?]. Boussinot et al. have proposed a weaker version of this feature where the
reaction to the absence happens in the following instant [?] and they have implemented
it in various programming environments based on C, JAVA, and SCHEME [?].
Applications suited to this programming style include: event-driven applications, graphical user
interfaces, simulations (e.g. N-bodies problem, cellular automata, ad hoc networks), web
services, multiplayer online games, ... Boussinot et al. have also advocated the relevance
of this concept for the programming of mobile code and demonstrated that the possibility
for a ‘synchronous’ mobile agent to react to the absence of an event is an added factor
of flexibility for programs designed for open distributed systems, whose behaviours are
inherently difficult to predict. These applications rely on data structures such as lists and
trees whose size needs to be controlled.

Recently, Boudol [?] has proposed a formalisation of this programming model. Our
analysis will essentially focus on a small fragment of this model without higher-order functions,
and where the creation of fresh memory cells (registers) and the spawning of new threads
is only allowed at the very beginning of an instant. We believe that what is left is still
expressive and challenging enough as far as resource control is concerned. Our analysis goes
in three main steps. A first step is to guarantee that each instant terminates (Section 3.1).
A second step is to bound the size of the computed values as a function of the size of the
parameters at the beginning of the instant (Section 3.2). A third step is to combine the
termination and size analyses. Here we show how to obtain polynomial bounds on the
space and time needed for the execution of the system during an instant as a function of
the size of the parameters at the beginning of the instant (Section 3.3).

A characteristic of our static analyses is that to a great extent they make abstraction of 
the memory and the scheduler. This means that each thread can be analysed separately, 
that the complexity of the analyses grows linearly in the number of threads, and that an 
incremental analysis of a dynamically changing system of threads is possible. Preliminary
to these analyses is a control flow analysis (Section 2.1) that guarantees that each thread
performs each read instruction (in its body code) at most once in an instant. This
condition is instrumental to resource control. In particular, it allows us to regard behaviours as
functions of their initial parameters and the registers they may read in the instant. Taking
this functional viewpoint, we are able to adapt the main techniques developed for proving 
termination and size bounds in the first-order functional setting. 

We point out that our static size analyses are not intended to predict the size of the
system after arbitrarily many instants. This is a harder problem which in general requires
an understanding of the global behaviour of the system and/or stronger restrictions on the 
programs we can write. For the language studied in this paper, we advocate a combination 
of our static analyses with a dynamic controller that at the end of each instant checks the 
size of the parameters of the system and may decide to stop some threads taking too much 
space. 

Along the way and in Appendix [?] we provide a number of programming examples
illustrating how certain synchronous and/or concurrent programming paradigms can be
represented in our model. These examples suggest that the constraints imposed by the 
static analyses are not too severe and that their verification can be automated. 

As a second contribution, we describe a virtual machine and the related bytecode for our
programming model (Section [?]). This provides a more precise description of the resources
needed for the execution of the systems we consider and opens the way to the verification
of resource bounds at the bytecode level, following the ‘typed assembly language’ approach
adopted in [1] for the purely functional fragment of the language. More precisely, we
describe a control flow analysis that allows us to recover the conditions for termination and size
bounds at the bytecode level and we show that the control flow analysis is sufficiently liberal
to accept the code generated by a rather standard compilation function.

Proofs are available in Appendix [?].

2 A Model of Synchronous Cooperative Threads 

A system of synchronous cooperative threads is described by (1) a list of mutually recursive 
type and constructor definitions and (2) a list of mutually recursive function and behaviour 
definitions relying on pattern matching. In this respect, the resulting programming
language is reminiscent of ERLANG, which is a practical language to develop concurrent
applications. The set of instructions a behaviour can execute is rather minimal. Indeed,
our language can be regarded as an intermediate code where, for instance, general pattern
matching has been compiled into a nesting of if_then_else constructs and complex control
structures have been compiled into a simple tail-recursive form. 

Types We denote type names with $t, t', \ldots$ and constructors with $c, c', \ldots$ We will also
denote with $r, r', \ldots$ constructors of arity 0 and of ‘reference’ type (see equation of kind (2)
below) and we will refer to them as registers (thus registers are constructors). The values
$v, v', \ldots$ computed by programs are first order terms built out of constructors. Types and
constructors are declared via recursive equations that may be of two kinds:

(1) $t = \cdots \mid c\ \mathsf{of}\ t_1, \ldots, t_n \mid \cdots$

(2) $t = \mathsf{Ref}(t')\ \mathsf{with}\ \cdots \mid r = v \mid \cdots$

In (1) we declare a type $t$ with a constructor $c$ of functional type $(t_1, \ldots, t_n) \to t$. In (2)
we declare a type $t$ of registers referencing values of type $t'$ and a register $r$ with initial
value $v$. As usual, type definitions can be mutually recursive (functional and reference
types can be intermingled) and it is assumed that all types and constructors are declared
exactly once. This means that we can associate a unique type with every constructor
and that with respect to this association we can say when a value is well-typed. For
instance, we may define the type $\mathsf{nat}$ of natural numbers in unary format by the equation
$\mathsf{nat} = \mathsf{z} \mid \mathsf{s}\ \mathsf{of}\ \mathsf{nat}$ and the type $\mathsf{llist}$ of linked lists of natural numbers by the equations
$\mathsf{nlist} = \mathsf{nil} \mid \mathsf{cons}\ \mathsf{of}\ (\mathsf{nat}, \mathsf{llist})$ and $\mathsf{llist} = \mathsf{Ref}(\mathsf{nlist})\ \mathsf{with}\ r = \mathsf{cons}(\mathsf{z}, r)$. The last definition
declares a register $r$ of type $\mathsf{llist}$ with initial value the infinite (cyclic) list containing only
$\mathsf{z}$’s.

Finally, we have a special behaviour type, $\mathsf{beh}$. Elements of type $\mathsf{beh}$ do not return a value
but produce side effects. We denote with $\beta$ either a regular type or $\mathsf{beh}$.

Expressions We let $x, y, \ldots$ denote variables ranging over values. The size $|v|$ of a value
$v$ is defined by $|c| = 0$ and $|c(v_1, \ldots, v_n)| = 1 + |v_1| + \cdots + |v_n|$. In the following, we
will use the vectorial notation $\vec{a}$ to denote either a vector $a_1, \ldots, a_n$ or a sequence $a_1 \cdots a_n$
of elements. We use $\sigma, \sigma', \ldots$ to denote a substitution $[\vec{v}/\vec{x}]$, where $\vec{v}$ and $\vec{x}$ have the
same length. A pattern $p$ is a well-typed term built out of constructors and variables. In
particular, a shallow linear pattern $p$ is a pattern $c(x_1, \ldots, x_n)$, where $c$ is a constructor of
arity $n$ and the variables $x_1, \ldots, x_n$ are all distinct. Expressions, $e$, and expression bodies,
$eb$, are defined as:

\[
\begin{array}{lcl}
e  & ::= & x \mid c(e_1, \ldots, e_k) \mid f(e_1, \ldots, e_n) \\
eb & ::= & e \mid \mathsf{match}\ x\ \mathsf{with}\ p\ \mathsf{then}\ eb\ \mathsf{else}\ eb
\end{array}
\]

where $f$ is a functional symbol of type $(t_1, \ldots, t_n) \to t$, specified by an equation of the
kind $f(x_1, \ldots, x_n) = eb$, and where $p$ is a shallow linear pattern.

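A closed expression body $eb$ evaluates to a value $v$ according to the following standard
rules:

\[
\begin{array}{ll}
(e_1) & \dfrac{}{r \Downarrow r} \\[3ex]
(e_2) & \dfrac{\vec{e} \Downarrow \vec{v}}{c(\vec{e}) \Downarrow c(\vec{v})} \\[3ex]
(e_3) & \dfrac{\vec{e} \Downarrow \vec{v}, \quad f(\vec{x}) = eb, \quad [\vec{v}/\vec{x}]eb \Downarrow v}{f(\vec{e}) \Downarrow v} \\[3ex]
(e_4) & \dfrac{[\vec{v}/\vec{x}]eb_1 \Downarrow v}{\mathsf{match}\ c(\vec{v})\ \mathsf{with}\ c(\vec{x})\ \mathsf{then}\ eb_1\ \mathsf{else}\ eb_2 \Downarrow v} \\[3ex]
(e_5) & \dfrac{eb_2 \Downarrow v, \quad c \neq d}{\mathsf{match}\ c(\vec{v})\ \mathsf{with}\ d(\vec{x})\ \mathsf{then}\ eb_1\ \mathsf{else}\ eb_2 \Downarrow v}
\end{array}
\]

Since registers are constructors, rule $(e_1)$ is a special case of rule $(e_2)$; we keep the rule for
clarity.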
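
The evaluation rules can be read as a small interpreter. The following Python sketch is only an illustration under stated assumptions: values and expressions are encoded as nested tuples (symbol, arg1, ..., argn), variables as strings, and substitution is replaced by an environment; the names Match, evaluate, size and DEFS are hypothetical and not part of the model. Any symbol without a defining equation is treated as a constructor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """match var with cons(binders) then then_branch else else_branch."""
    var: str
    cons: str
    binders: tuple
    then_branch: object
    else_branch: object


def size(value):
    """|c| = 0 and |c(v1,...,vn)| = 1 + |v1| + ... + |vn|."""
    return 0 if len(value) == 1 else 1 + sum(size(arg) for arg in value[1:])


def evaluate(body, env, defs):
    """Evaluate an expression body; env maps variables to values."""
    if isinstance(body, Match):
        value = env[body.var]
        if value[0] == body.cons:  # rule (e4): the pattern matches
            inner = {**env, **dict(zip(body.binders, value[1:]))}
            return evaluate(body.then_branch, inner, defs)
        return evaluate(body.else_branch, env, defs)  # rule (e5)
    if isinstance(body, str):  # a variable bound in the environment
        return env[body]
    head, *args = body
    values = [evaluate(arg, env, defs) for arg in args]
    if head in defs:  # rule (e3): unfold the function definition
        params, function_body = defs[head]
        return evaluate(function_body, dict(zip(params, values)), defs)
    return (head, *values)  # rules (e1) and (e2): constructors


# dble(n) = match n with s(m) then s(s(dble(m))) else z
DEFS = {"dble": (("n",), Match("n", "s", ("m",), ("s", ("s", ("dble", "m"))), ("z",)))}
THREE = ("s", ("s", ("s", ("z",))))
RESULT = evaluate(("dble", "x"), {"x": THREE}, DEFS)
```

Here RESULT is the unary encoding of 6, so its size is twice the size of THREE.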


Behaviours Some function symbols may return a thread behaviour b,b',... rather than 
a value. In contrast to ‘pure’ expressions, a behaviour does not return a result but produces 
side-effects by reading and writing registers. A behaviour may also affect the scheduling 
status of the thread executing it. We denote with b, b',... behaviours defined as follows: 

\[
\begin{array}{lcl}
b & ::= & \mathsf{stop} \mid f(\vec{e}) \mid \mathsf{yield}.b \mid \mathsf{next}.f(\vec{e}) \mid \varrho := e.b \mid \\
  &     & \mathsf{read}\ \varrho\ \mathsf{with}\ p_1 \Rightarrow b_1 \mid \cdots \mid p_n \Rightarrow b_n \mid [\_] \Rightarrow f(\vec{e}) \mid \\
  &     & \mathsf{match}\ x\ \mathsf{with}\ c(\vec{x})\ \mathsf{then}\ b_1\ \mathsf{else}\ b_2
\end{array}
\]

where: (i) $f$ is a functional symbol of type $t_1, \ldots, t_n \to \mathsf{beh}$, defined by an equation
$f(\vec{x}) = b$, (ii) $\varrho, \varrho', \ldots$ range over variables and registers, and (iii) $p_1, \ldots, p_n$ are either
shallow linear patterns or variables. We also denote with $[\_]$ a special symbol that will be
used in the default case of read expressions (see the paragraph Scheduler below). Note
that if the pattern $p_i$ is a variable then the following branches including the default one
can never be executed.

The effect of the various instructions is informally described as follows: $\mathsf{stop}$ terminates
the executing thread for ever; $\mathsf{yield}.b$ halts the execution and hands over the control to
the scheduler (the control should return to the thread later in the same instant and
execution resumes with $b$); $f(\vec{e})$ and $\mathsf{next}.f(\vec{e})$ switch to another behaviour immediately
or at the beginning of the following instant; $r := e.b$ evaluates the expression $e$, assigns
its value to $r$ and proceeds with the evaluation of $b$; $\mathsf{read}\ r\ \mathsf{with}\ p_1 \Rightarrow b_1 \mid \cdots \mid p_n \Rightarrow b_n \mid
[\_] \Rightarrow b$ waits until the value of $r$ matches one of the patterns $p_1, \ldots, p_n$ (there could be no
delay) and yields the control otherwise; if at the end of the instant the thread is still
stuck waiting for a matching value then it starts the behaviour $b$ in the following instant;
$\mathsf{match}\ v\ \mathsf{with}\ p\ \mathsf{then}\ b_1\ \mathsf{else}\ b_2$ filters the value $v$ according to the pattern $p$, and it never blocks
the execution. Note that if $p$ is a pattern and $v$ is a value there is at most one matching
substitution $\sigma$ such that $v = \sigma p$.

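Behaviour reduction is described by the 9 rules below. A reduction $(b, s) \xrightarrow{X} (b', s')$ means
that the behaviour $b$ with store $s$ runs an atomic sequence of actions till $b'$, producing a
store $s'$, and returning the control to the scheduler with status $X$. A status is a value in
$\{N, R, S, W\}$ that represents one of the four possible states of a thread: $N$ stands for next
(the thread will resume at the beginning of the next instant), $R$ for run, $S$ for stopped,
and $W$ for wait (the thread is blocked on a read statement).

\[
\begin{array}{ll}
(b_1) & \dfrac{}{(\mathsf{stop}, s) \xrightarrow{S} (\mathsf{stop}, s)} \\[3ex]
(b_2) & \dfrac{}{(\mathsf{yield}.b, s) \xrightarrow{R} (b, s)} \\[3ex]
(b_3) & \dfrac{}{(\mathsf{next}.f(\vec{e}), s) \xrightarrow{N} (f(\vec{e}), s)} \\[3ex]
(b_4) & \dfrac{([\vec{v}/\vec{x}]b_1, s) \xrightarrow{X} (b', s')}{(\mathsf{match}\ c(\vec{v})\ \mathsf{with}\ c(\vec{x})\ \mathsf{then}\ b_1\ \mathsf{else}\ b_2,\ s) \xrightarrow{X} (b', s')} \\[3ex]
(b_5) & \dfrac{(b_2, s) \xrightarrow{X} (b', s'), \quad c \neq d}{(\mathsf{match}\ c(\vec{v})\ \mathsf{with}\ d(\vec{x})\ \mathsf{then}\ b_1\ \mathsf{else}\ b_2,\ s) \xrightarrow{X} (b', s')} \\[3ex]
(b_6) & \dfrac{\text{no pattern matches } s(r)}{(\mathsf{read}\ r \ldots, s) \xrightarrow{W} (\mathsf{read}\ r \ldots, s)} \\[3ex]
(b_7) & \dfrac{s(r) = \sigma p, \quad (\sigma b, s) \xrightarrow{X} (b', s')}{(\mathsf{read}\ r\ \mathsf{with} \cdots \mid p \Rightarrow b \mid \cdots,\ s) \xrightarrow{X} (b', s')} \\[3ex]
(b_8) & \dfrac{\vec{e} \Downarrow \vec{v}, \quad f(\vec{x}) = b, \quad ([\vec{v}/\vec{x}]b, s) \xrightarrow{X} (b', s')}{(f(\vec{e}), s) \xrightarrow{X} (b', s')} \\[3ex]
(b_9) & \dfrac{e \Downarrow v, \quad (b, s[v/r]) \xrightarrow{X} (b', s')}{(r := e.b, s) \xrightarrow{X} (b', s')}
\end{array}
\]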

We denote with $be$ either an expression body or a behaviour. All expressions and
behaviours are supposed to be well-typed. As usual, all formal parameters are supposed to be
distinct. In the $\mathsf{match}\ x\ \mathsf{with}\ c(\vec{y})\ \mathsf{then}\ be_1\ \mathsf{else}\ be_2$ instruction, $be_1$ may depend on $\vec{y}$ but
not on $x$ while $be_2$ may depend on $x$ but not on $\vec{y}$.


Systems We suppose that the execution environment consists of $n$ threads and we
associate with every thread a distinct identity that is an index in $Z_n = \{0, 1, \ldots, n-1\}$. We
let $B, B', \ldots$ denote systems of synchronous threads, that is finite mappings from thread
indexes to pairs (behaviour, status). Each register has a type and a default value (its
value at the beginning of an instant) and we use $s, s', \ldots$ to denote a store, an association
between registers and their values. We suppose that at the beginning of each instant the
store is $s_0$, such that each register is assigned its default value. If $B$ is a system and $i \in Z_n$
is a valid thread index then we denote with $B_1(i)$ the behaviour executed by the thread $i$
and with $B_2(i)$ its current status. Initially, all threads have status $R$, the current thread
index is 0, and $B_1(i)$ is a behaviour expression of the shape $f(\vec{v})$ for all $i \in Z_n$. System
reduction is described by a relation $(B, s, i) \to (B', s', i')$: the system $B$ with store $s$ and
current thread (index) $i$ runs an atomic sequence of actions and becomes $(B', s', i')$.


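\[
\begin{array}{ll}
(s_1) & \dfrac{(B_1(i), s) \xrightarrow{X} (b', s'), \quad B_2(i) = R, \quad B' = B[(b', X)/i], \quad \mathcal{N}(B', s', i) = k}{(B, s, i) \to (B'[(B'_1(k), R)/k],\ s',\ k)} \\[4ex]
(s_2) & \dfrac{\begin{array}{c}(B_1(i), s) \xrightarrow{X} (b', s'), \quad B_2(i) = R, \quad B' = B[(b', X)/i], \quad \mathcal{N}(B', s', i)\uparrow, \\ B'' = \mathcal{U}(B', s'), \quad \mathcal{N}(B'', s_0, 0) = k\end{array}}{(B, s, i) \to (B'', s_0, k)}
\end{array}
\]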


Scheduler The scheduler is determined by the functions $\mathcal{N}$ and $\mathcal{U}$. To ensure progress
of the scheduling, we assume that if $\mathcal{N}$ returns an index then it must be possible to run the
corresponding thread in the current instant and that if $\mathcal{N}$ is undefined (denoted $\mathcal{N}(\ldots)\uparrow$)
then no thread can be run in the current instant.

If $\mathcal{N}(B, s, i) = k$ then $B_2(k) = R$ or ($B_2(k) = W$ and
$B_1(k) = \mathsf{read}\ r\ \mathsf{with} \cdots \mid p \Rightarrow b \mid \cdots$ and some pattern
matches $s(r)$, i.e., $\exists \sigma\ \sigma p = s(r)$).

If $\mathcal{N}(B, s, i)\uparrow$ then $\forall k \in Z_n$, $B_2(k) \in \{N, S\}$ or ($B_2(k) = W$,
$B_1(k) = \mathsf{read}\ r\ \mathsf{with} \ldots$, and no pattern matches $s(r)$).

When no more thread can run, the instant ends and the function $\mathcal{U}$ performs the
following status transitions: $N \to R$, $W \to R$. We assume here that every thread in status
$W$ takes the $[\_] \Rightarrow \ldots$ branch at the beginning of the next instant. Note that the function
$\mathcal{N}$ is undefined on the updated system if and only if all threads are stopped.

\[
\mathcal{U}(B, s)(i) =
\begin{cases}
(b, S) & \text{if } B(i) = (b, S) \\
(b, R) & \text{if } B(i) = (b, N) \\
(f(\vec{e}), R) & \text{if } B(i) = (\mathsf{read}\ r\ \mathsf{with} \cdots \mid [\_] \Rightarrow f(\vec{e}),\ W)
\end{cases}
\]
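
As an illustration of the end-of-instant update, here is a minimal Python sketch of the function U. The encoding is an assumption made for the example: a system is a list of (behaviour, status) pairs indexed by thread identity, statuses are the strings "N", "R", "S", "W", and a blocked read is a hypothetical Read record carrying the behaviour of its default branch. The store does not appear because, as in the definition, the update does not depend on it (the reset to the default store is performed by the system reduction rule that ends the instant).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    """A read instruction; only the default branch matters for the update."""
    register: str
    branches: tuple   # pairs (pattern, behaviour), unused by the update
    default: object   # behaviour f(e) of the [_] branch


def update(system):
    """Status transitions performed at the end of an instant."""
    result = []
    for behaviour, status in system:
        if status == "S":          # stopped threads stay stopped
            result.append((behaviour, "S"))
        elif status == "N":        # N -> R
            result.append((behaviour, "R"))
        elif status == "W":        # W -> R, taking the default branch
            result.append((behaviour.default, "R"))
        else:                      # assumption: no runnable thread remains
            raise ValueError("status R: the instant has not ended yet")
    return result


SYSTEM = [
    ("stop", "S"),
    (("f", ("z",)), "N"),
    (Read("r", (), ("g",)), "W"),
]
UPDATED = update(SYSTEM)
```

After the update every thread that is not stopped has status R, so the scheduler is undefined on the updated system only when all threads are stopped.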


Example 1 (channels and signals) The read instruction allows us to read a register
subject to certain filter conditions. This is a powerful mechanism which recalls, e.g., Linda
communication [?] and that allows us to encode various forms of channel and signal
communication.

(1) We want to represent a one place channel $c$ carrying values of type $t$. We introduce
a new type $\mathsf{ch}(t) = \mathsf{empty} \mid \mathsf{full}\ \mathsf{of}\ t$ and a register $c$ of type $\mathsf{Ref}(\mathsf{ch}(t))$ with default value
$\mathsf{empty}$. A thread should send a message on $c$ only if $c$ is empty and it should receive a
message only if $c$ is not empty (a received message is discarded). These operations can be
modelled using the following two derived operators:

\[
\begin{array}{lcl}
\mathsf{send}(c, e).b & =_{\mathrm{def}} & \mathsf{read}\ c\ \mathsf{with}\ \mathsf{empty} \Rightarrow c := \mathsf{full}(e).b \\
\mathsf{receive}(c, x).b & =_{\mathrm{def}} & \mathsf{read}\ c\ \mathsf{with}\ \mathsf{full}(x) \Rightarrow c := \mathsf{empty}.b
\end{array}
\]

(2) We want to represent a fifo channel $c$ carrying values of type $t$ such that a thread
can always emit a value on $c$ but may receive only if there is at least one message in the
channel. We introduce a new type $\mathsf{fch}(t) = \mathsf{nil} \mid \mathsf{cons}\ \mathsf{of}\ t, \mathsf{fch}(t)$ and a register $c$ of type
$\mathsf{Ref}(\mathsf{fch}(t))$ with default value $\mathsf{nil}$. Hence a fifo channel is modelled by a register holding a
list of values. We consider two read operations ($\mathsf{freceive}$ to fetch the first message on the
channel and $\mathsf{freceiveall}$ to fetch the whole queue of messages) and we use the auxiliary
function $\mathit{insert}$ to queue messages at the end of the list:

\[
\begin{array}{lcl}
\mathsf{fsend}(c, e).b & =_{\mathrm{def}} & \mathsf{read}\ c\ \mathsf{with}\ l \Rightarrow c := \mathit{insert}(e, l).b \\
\mathsf{freceive}(c, x).b & =_{\mathrm{def}} & \mathsf{read}\ c\ \mathsf{with}\ \mathsf{cons}(x, l) \Rightarrow c := l.b \\
\mathsf{freceiveall}(c, x).b & =_{\mathrm{def}} & \mathsf{read}\ c\ \mathsf{with}\ \mathsf{cons}(y, l) \Rightarrow c := \mathsf{nil}.[\mathsf{cons}(y, l)/x]b \\[1ex]
\mathit{insert}(x, l) & = & \mathsf{match}\ l\ \mathsf{with}\ \mathsf{cons}(y, l')\ \mathsf{then}\ \mathsf{cons}(y, \mathit{insert}(x, l')) \\
 & & \mathsf{else}\ \mathsf{cons}(x, \mathsf{nil})
\end{array}
\]


(3) We want to represent a signal $s$ with the typical associated primitives: emitting a signal
and blocking until a signal is present. We define a type $\mathsf{sig} = \mathsf{abst} \mid \mathsf{prst}$ and a register $s$ of
type $\mathsf{Ref}(\mathsf{sig})$ with default value $\mathsf{abst}$, meaning that a signal is originally absent:

\[
\mathsf{emit}(s).b =_{\mathrm{def}} s := \mathsf{prst}.b
\qquad
\mathsf{wait}(s).b =_{\mathrm{def}} \mathsf{read}\ s\ \mathsf{with}\ \mathsf{prst} \Rightarrow b
\]


Example 2 (cooperative fragment) The cooperative fragment of the model with no
synchrony is obtained by removing the next instruction and assuming that for all read
instructions the branch $[\_] \Rightarrow f(\vec{e})$ is such that $f(\ldots) = \mathsf{stop}$. Then all the interesting
computation happens in the first instant; threads still running in the second instant can
only stop. By using the representation of fifo channels presented in Example 1(2) above,
the cooperative fragment is already powerful enough to simulate, e.g., Kahn networks [?].


Next, to make possible a compositional and functional analysis for resource control, we 
propose to restrict the admissible behaviours and we define a simple preliminary control 
flow analysis that guarantees that this restriction is met. We then rely on this analysis to 
define a symbolic representation of the states reachable by a behaviour. Finally, we extract
from these symbolic control points suitable order constraints which are instrumental to our
analyses for termination and value size limitation within an instant. 


2.1 Read Once Condition 

We require and statically check on the call graph of the program (see below) that threads 
can perform any given read instruction at most once in an instant. 

1. We assign to every read instruction in a system a distinct fresh label, $y$, and we
collect all these labels in an ordered sequence, $y_1, \ldots, y_m$. In the following, we will
sometimes use the notation $\mathsf{read}\langle y \rangle\ \varrho\ \mathsf{with} \ldots$ in the code of a behaviour to make
visible the label of a read instruction.

2. With every function symbol $f$ defined by an equation $f(\vec{x}) = b$ we associate the set
$L(f)$ of labels of read instructions occurring in $b$.

3. We define a directed call graph $G = (N, E)$ as follows: $N$ is the set of function symbols
in the program defined by an equation $f(\vec{x}) = b$ and $(f, g) \in E$ if $g \in \mathit{Call}(b)$ where
$\mathit{Call}(b)$ is the collection of function symbols in $N$ that may be called in the current
instant and which is formally defined as follows:

\[
\begin{array}{l}
\mathit{Call}(\mathsf{stop}) = \mathit{Call}(\mathsf{next}.g(\vec{e})) = \emptyset \qquad \mathit{Call}(f(\vec{e})) = \{f\} \\
\mathit{Call}(\mathsf{yield}.b) = \mathit{Call}(\varrho := e.b) = \mathit{Call}(b) \\
\mathit{Call}(\mathsf{match}\ x\ \mathsf{with}\ p\ \mathsf{then}\ b_1\ \mathsf{else}\ b_2) = \mathit{Call}(b_1) \cup \mathit{Call}(b_2) \\
\mathit{Call}(\mathsf{read}\ \varrho\ \mathsf{with}\ p_1 \Rightarrow b_1 \mid \cdots \mid p_n \Rightarrow b_n \mid [\_] \Rightarrow b) = \bigcup_{i=1,\ldots,n} \mathit{Call}(b_i)
\end{array}
\]

We write $f E^* g$ if the node $g$ is reachable from the node $f$ in the graph $G$. We denote
with $R(f)$ the set of labels $\bigcup \{L(g) \mid f E^* g\}$ and with $\vec{y}_f$ the ordered sequence of
labels in $R(f)$.

The definition of Call is such that for every sequence of calls in the execution of a 
thread within an instant we can find a corresponding path in the call graph. 

Definition 3 (read once condition) A system satisfies the read once condition if in the
call graph there are no loops that go through a node $f$ such that $L(f) \neq \emptyset$.
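
The read once condition is a plain reachability check on the call graph. The sketch below is one possible way to carry it out in Python, assuming the graph is given as a dictionary from each function symbol to the set of symbols it may call in the current instant, and the labels L(f) as a second dictionary; both encodings and the function names are hypothetical, and every node is assumed to appear as a key of the graph.

```python
def reachable(graph, start):
    """Nodes reachable from `start` by following one or more edges."""
    seen, stack = set(), list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def read_once(graph, labels):
    """True if no loop of the call graph goes through a node with read labels."""
    return not any(
        labels.get(f) and f in reachable(graph, f)
        for f in graph
    )


# A single node with a read instruction and no edges: condition satisfied.
NO_LOOP = read_once({"f": set()}, {"f": {"u"}})
# A self-recursive behaviour that reads a register: condition violated.
SELF_LOOP = read_once({"g": {"g"}}, {"g": {"v"}})
# A loop through a node without read instructions is harmless.
HARMLESS = read_once({"f": {"h"}, "h": {"h"}}, {"f": {"u"}, "h": set()})
```

A node lies on a loop exactly when it is reachable from itself, which is why the check only needs the set of nodes reachable by at least one edge.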

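Example 4 (alarm) We consider the representation of signals as in Example 1(3). We
assume two signals $\mathit{sig}$ and $\mathit{ring}$. The behaviour $\mathit{alarm}(n, m)$ will emit a signal on $\mathit{ring}$ if
it detects that no signal is emitted on $\mathit{sig}$ for $m$ consecutive instants. The alarm delay is
reset to $n$ if the signal $\mathit{sig}$ is present.

\[
\begin{array}{lcl}
\mathit{alarm}(x, y) & = & \mathsf{match}\ y\ \mathsf{with}\ \mathsf{s}(y') \\
 & & \mathsf{then}\ \mathsf{read}\langle u \rangle\ \mathit{sig}\ \mathsf{with}\ \mathsf{prst} \Rightarrow \mathsf{next}.\mathit{alarm}(x, x) \mid [\_] \Rightarrow \mathit{alarm}(x, y') \\
 & & \mathsf{else}\ \mathit{ring} := \mathsf{prst}.\mathsf{stop}
\end{array}
\]

Hence $u$ is the label associated with the read instruction and $L(\mathit{alarm}) = \{u\}$. Since the
call graph has just one node, $\mathit{alarm}$, and no edges, the read once condition is satisfied.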

To summarise, the read once condition is a checkable syntactic condition that safely 
approximates the semantic property we are aiming at. 

Proposition 5 If a system satisfies the read once condition then in every instant every 
thread runs every read instruction at most once (but the same read instruction can be run 
by several threads). 

The following simple example shows that without the read once restriction, a thread
can use a register as an accumulator and produce an exponential growth of the size of the
data within an instant.

Example 6 (exponentiation) We recall that $\mathsf{nat} = \mathsf{z} \mid \mathsf{s}\ \mathsf{of}\ \mathsf{nat}$ is the type of tally
natural numbers. The function $\mathit{dble}$ defined below doubles the value of its parameter so that
$|\mathit{dble}(n)| = 2|n|$. We assume $r$ is a register of type $\mathsf{nat}$ with initial value $\mathsf{s}(\mathsf{z})$. Now consider
the following recursive behaviour:

\[
\begin{array}{lcl}
\mathit{dble}(n) & = & \mathsf{match}\ n\ \mathsf{with}\ \mathsf{s}(n')\ \mathsf{then}\ \mathsf{s}(\mathsf{s}(\mathit{dble}(n')))\ \mathsf{else}\ \mathsf{z} \\[1ex]
\mathit{exp}(n) & = & \mathsf{match}\ n\ \mathsf{with}\ \mathsf{s}(n') \\
 & & \mathsf{then}\ \mathsf{read}\ r\ \mathsf{with}\ m \Rightarrow r := \mathit{dble}(m).\mathit{exp}(n') \\
 & & \mathsf{else}\ \mathsf{stop}
\end{array}
\]

The function $\mathit{exp}$ does not satisfy the read once condition since the call graph has a loop
on the $\mathit{exp}$ node. The evaluation of $\mathit{exp}(n)$ involves $|n|$ reads to the register $r$ and, after
each read operation, the size of the value stored in $r$ doubles. Hence, at the end of the instant,
the register contains a value of size $2^{|n|}$.
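
The exponential growth can be observed directly by simulating a single thread running exp. The following Python sketch is an illustration only: unary numbers are encoded as nested tuples, the store is a dictionary with one register named "r", and the helper names (unary, run_exp) are hypothetical. Since the read uses a variable pattern, it never blocks, so the recursion is unfolded as a loop.

```python
Z = ("z",)


def s(n):
    """Successor constructor."""
    return ("s", n)


def unary(k):
    """Unary encoding s(...s(z)...) of the natural number k."""
    n = Z
    for _ in range(k):
        n = s(n)
    return n


def size(v):
    """|c| = 0 and |c(v1,...,vn)| = 1 + |v1| + ... + |vn|."""
    return 0 if len(v) == 1 else 1 + sum(size(a) for a in v[1:])


def dble(n):
    """dble(n) = match n with s(n') then s(s(dble(n'))) else z."""
    return s(s(dble(n[1]))) if n[0] == "s" else Z


def run_exp(n, store):
    """One thread running exp(n): each iteration reads r, then writes dble of it."""
    while n[0] == "s":
        m = store["r"]        # read r with m => ...
        store["r"] = dble(m)  # r := dble(m)
        n = n[1]              # ... .exp(n')
    return store


STORE = run_exp(unary(5), {"r": s(Z)})
```

Starting from a register of size 1, five reads leave a value of size 32 in the register, in line with the bound stated in the example.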

The read once condition does not appear to be a severe limitation on the expressiveness 
of a synchronous programming language. Intuitively, in most synchronous algorithms every 
thread reads some bounded number of variables before performing some action. Note that 
while the number of variables is bounded by a constant, the amount of information that 
can be read in each variable is not. Thus, for instance, a ‘server’ thread can just read 
one variable in which is stored the list of requests produced so far and then it can go on 
scanning the list and replying to all the requests within the same instant. 


2.2 Control Points 

From a technical point of view, an important consequence of the read once condition is 
that a behaviour can be described as a function of its parameters and the registers it may 
read during an instant. This fact is used to associate with a system satisfying the read 
once condition a finite number of control points. 

A control point is a triple $(f(\vec{p}), be, i)$ where, intuitively, $f$ is the currently called function,
$\vec{p}$ represents the patterns crossed so far in the function definition plus possibly the labels
of the read instructions that still have to be executed, $be$ is the continuation, and $i$ is an
integer flag in $\{0, 1, 2\}$ that will be used to associate with the control point various kinds
of conditions.

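If the function $f$ returns a value and is defined by the equation $f(\vec{x}) = eb$, then we
associate with $f$ the set $C(f, \vec{x}, eb)$ defined as follows:

\[
C(f, \vec{p}, eb) = \mathsf{case}\ eb\ \mathsf{of}
\left\{
\begin{array}{ll}
e & : \{(f(\vec{p}), eb, 0)\} \\[1ex]
\mathsf{match}\ x\ \mathsf{with}\ c(\vec{y})\ \mathsf{then}\ eb_1\ \mathsf{else}\ eb_2 & : \{(f(\vec{p}), eb, 2)\} \cup C(f, [c(\vec{y})/x]\vec{p}, eb_1) \cup C(f, \vec{p}, eb_2)
\end{array}
\right.
\]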
On the other hand, suppose the function $f$ is a behaviour defined by the equation $f(\vec{x}) = b$.
Then we generate a fresh function symbol $f^+$ whose arity is that of $f$ plus the size of $R(f)$,
thus regarding the labels $\vec{y}_f$ (the ordered sequence of labels in $R(f)$) as part of the formal
parameters of $f^+$. The set of control points associated with $f^+$ is the set $C(f^+, (\vec{x} \cdot \vec{y}_f), b)$
defined as follows:

\[
C(f^+, \vec{p}, b) = \mathsf{case}\ b\ \mathsf{of}
\]
\[
\begin{array}{lll}
(C_1) & \mathsf{stop} & : \{(f^+(\vec{p}), b, 2)\} \\[1ex]
(C_2) & g(\vec{e}) & : \{(f^+(\vec{p}), b, 0)\} \\[1ex]
(C_3) & \mathsf{yield}.b' & : \{(f^+(\vec{p}), b, 2)\} \cup C(f^+, \vec{p}, b') \\[1ex]
(C_4) & \mathsf{next}.g(\vec{e}) & : \{(f^+(\vec{p}), b, 2),\ (f^+(\vec{p}), g(\vec{e}), 2)\} \\[1ex]
(C_5) & \varrho := e.b' & : \{(f^+(\vec{p}), b, 2),\ (f^+(\vec{p}), e, 1)\} \cup C(f^+, \vec{p}, b') \\[1ex]
(C_6) & \mathsf{match}\ x\ \mathsf{with}\ c(\vec{y})\ \mathsf{then}\ b_1\ \mathsf{else}\ b_2 & : \{(f^+(\vec{p}), b, 2)\} \cup C(f^+, [c(\vec{y})/x]\vec{p}, b_1) \cup C(f^+, \vec{p}, b_2) \\[1ex]
(C_7) & \mathsf{read}\langle y \rangle\ \varrho\ \mathsf{with}\ p_1 \Rightarrow b_1 \mid \cdots \mid p_n \Rightarrow b_n \mid [\_] \Rightarrow g(\vec{e}) & : \{(f^+(\vec{p}), b, 2),\ (f^+(\vec{p}), g(\vec{e}), 2)\} \\
 & & \quad \cup\ C(f^+, [p_1/y]\vec{p}, b_1) \cup \cdots \cup C(f^+, [p_n/y]\vec{p}, b_n)
\end{array}
\]

By inspecting the definitions, we can check that a control point $(f(\vec{p}), be, i)$ has the
property that $\mathit{Var}(be) \subseteq \mathit{Var}(\vec{p})$.
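
For the simpler case of a function returning a value, the set of control points C(f, p, eb) can be computed by a direct recursion on the expression body. The following Python sketch illustrates this under assumed encodings: terms are nested tuples (symbol, arg1, ..., argn), variables are strings, and a match is a hypothetical Match record; the function names are not taken from the paper. The behaviour case follows the same recursion with more clauses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """match var with cons(binders) then then_branch else else_branch."""
    var: str
    cons: str
    binders: tuple
    then_branch: object
    else_branch: object


def substitute(term, var, replacement):
    """Replace the variable `var` by `replacement` in a term."""
    if isinstance(term, str):
        return replacement if term == var else term
    head, *args = term
    return (head, *(substitute(arg, var, replacement) for arg in args))


def control_points(f, patterns, eb):
    """Control points (f(patterns), continuation, flag) of an expression body."""
    if isinstance(eb, Match):
        shallow = (eb.cons, *eb.binders)
        refined = tuple(substitute(p, eb.var, shallow) for p in patterns)
        return (
            [((f, *patterns), eb, 2)]
            + control_points(f, refined, eb.then_branch)
            + control_points(f, patterns, eb.else_branch)
        )
    return [((f, *patterns), eb, 0)]


# dble(n) = match n with s(m) then s(s(dble(m))) else z
BODY = Match("n", "s", ("m",), ("s", ("s", ("dble", "m"))), ("z",))
POINTS = control_points("dble", ("n",), BODY)
```

For the doubling function this yields three control points; the two with flag 0 correspond to the constraints dble(s(m)) ≻_0 s(s(dble(m))) and dble(n) ≻_0 z.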


Definition 7 An instance of a control point $(f(\vec{p}), be, i)$ is an expression body or a
behaviour $be' = \sigma(be)$, where $\sigma$ is a substitution mapping the free variables in $be$ to values.


The property of being an instance of a control point is preserved by expression body
evaluation, behaviour reduction and system reduction. Thus the control points associated
with a system do provide a representation of all reachable configurations. Indeed, in
Appendix [?] we show that it is possible to define the evaluation and the reduction on pairs
of control points and substitutions.

Proposition 8 Suppose $(B, s, i) \to (B', s', i')$ and that for all thread indexes $j \in Z_n$, $B_1(j)$
is an instance of a control point. Then for all $j \in Z_n$, we have that $B'_1(j)$ is an instance
of a control point.


In order to prove the termination of the instant and to obtain a bound on the size of
computed values, we associate order constraints with control points:

Control point                       Associated constraint
$(f(\vec{p}), e, 0)$                $f(\vec{p}) \succ_0 e$
$(f^+(\vec{p}), g(\vec{e}), 0)$     $f^+(\vec{p}) \succ_0 g^+(\vec{e}, \vec{y}_g)$
$(f^+(\vec{p}), e, 1)$              $f^+(\vec{p}) \succ_1 e$
$(f^+(\vec{p}), be, 2)$             no constraints

A program will be deemed correct if the set of constraints obtained from all the function
definitions can be satisfied in suitable structures. We say that a constraint $e \succ_i e'$ has index
$i$. We rely on the constraints of index 0 to enforce termination of the instant and on those
of index 0 or 1 to enforce a bound on the size of the computed values. Note that the
constraints are on pure first order terms, a property that allows us to reuse techniques
developed in the standard term rewriting framework (cf. Section [?]).

Example 9 With reference to Example 4, we obtain the following control points:

$(\mathit{alarm}^+(x, y, u),\ \mathsf{match} \ldots,\ 2)$
$(\mathit{alarm}^+(x, y, u),\ \mathsf{prst},\ 1)$
$(\mathit{alarm}^+(x, \mathsf{s}(y'), u),\ \mathsf{read} \ldots,\ 2)$
$(\mathit{alarm}^+(x, \mathsf{s}(y'), \mathsf{prst}),\ \mathsf{next}.\mathit{alarm}(x, x),\ 2)$
$(\mathit{alarm}^+(x, y, u),\ \mathit{ring} := \mathsf{prst}.\mathsf{stop},\ 2)$
$(\mathit{alarm}^+(x, \mathsf{z}, u),\ \mathsf{stop},\ 2)$
$(\mathit{alarm}^+(x, \mathsf{s}(y'), u),\ \mathit{alarm}(x, y'),\ 2)$
$(\mathit{alarm}^+(x, \mathsf{s}(y'), \mathsf{prst}),\ \mathit{alarm}(x, x),\ 2)$

The triple $(\mathit{alarm}^+(x, y, u), \mathsf{prst}, 1)$ is the only control point with a flag different from 2. It
corresponds to the constraint $\mathit{alarm}^+(x, y, u) \succ_1 \mathsf{prst}$, where $u$ is the label associated with
the only read instruction in the body of $\mathit{alarm}$. We note that no constraints of index 0 are
generated and so, in this simple case, the control flow analysis can already establish the
termination of the thread and all that is left to do is to check that the size of the data is under
control, which is also easily verified.

In Example [21 we have discussed a possible representation of Kahn networks in the 
cooperative fragment of our model. In general Kahn networks there is no bound on the 
number of messages that can be written in a fifo channel nor on the size of the messages. 
Much effort has been put into the static scheduling of Kahn networks (see, e.g., [221 H3 
EH)- This analysis can be regarded as a form of resource control since it guarantees that 
the number of messages in fifo channels is bounded (but says nothing about their size). 
The static scheduling of Kahn networks is also motivated by performance issues, since it 
eliminates the need to schedule threads at run time. Let us look in some detail at the 
programming language LUSTRE, that can be regarded as a language for programming 
Kahn networks that can be executed synchronously. 


Example 10 (read once vs. Lustre) A Lustre network is composed of four types of 
nodes: the combinatorial node, the delay node, the when node, and the merge node. Each 
node may have several input streams and one output stream. The functional behaviour of 
each type of node is defined by a set of recursive definitions. For instance, the node When 
has one boolean input stream b — with values of type bool = false | true — and one input 
stream s of values. A When node is used to output values from s whenever b is true. This 
behaviour may be described by the following recursive definitions: $When(false \cdot b, x \cdot s) = When(b, s)$, 
$When(true \cdot b, x \cdot s) = x \cdot When(b, s)$, and $When(b, s) = \epsilon$ otherwise. Here is a 
possible representation of the When node in our model, where the input streams correspond 
to one place channels b, c (cf. ExampleU^l)), the output stream to a one place channel d 
and at most one element in each input stream is processed per instant. 

When() = read⟨u⟩ b with
           full(true)  => read⟨v⟩ c with full(x) => d := x.next.When() | [_] => When()
         | full(false) => next.When()
         | [_]         => When()

While the function When has no formal parameters, we consider the function $When^+$ with 
two parameters $u$ and $v$ in our size and termination analyses. 
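
The recursive definitions of When given above can be read as a function on streams. The following Python sketch is only an illustration of that stream semantics, not of the thread-based representation: finite lists stand in for streams, and the name `when` is ours.

```python
def when(b, s):
    """Stream semantics of the When node, on finite lists.

    When(false.b, x.s) = When(b, s)
    When(true.b,  x.s) = x . When(b, s)
    When(b, s)         = empty stream, otherwise
    """
    out = []
    # zip consumes one element of each input per step and stops as soon as
    # one of the two inputs is exhausted (the "otherwise" case).
    for flag, x in zip(b, s):
        if flag:
            out.append(x)
    return out
```

For instance, `when([True, False, True], [1, 2, 3])` keeps the first and third elements of the value stream.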

3 Resource Control 


Our analysis goes in three main steps: first, we guarantee that each instant terminates 
(Section 3.1), second we bound the size of the computed values as a function of the size 
of the parameters at the beginning of the instant (Section 3.2), and third we combine the 
termination and size analyses to obtain polynomial bounds on space and time (Section 3.3). 
As we progress in our analysis, we refine the techniques we employ. Termination is reduced 
to the general problem of finding a suitable well-founded order over first-order terms. 
Bounding the size of the computed values is reduced to the problem of synthesizing a 
quasi-interpretation. Finally, the problem of obtaining polynomial bounds is attacked 
by combining recursive path ordering termination arguments with quasi-interpretations. 
We selected these techniques because they are well established and they can handle a 
significant spectrum of the programs we are interested in. It is to be expected that other 
characterisations of complexity classes available in the literature may lead to similar results. 

3.1 Termination of the Instant 

We recall that a reduction order $>$ over first-order terms is a well-founded order that is 
closed under context and substitution: $t > s$ implies $C[t] > C[s]$ and $\sigma t > \sigma s$, where $C$ is 
any one-hole context and $\sigma$ is any substitution (see, e.g., [?]). 

Definition 11 (termination condition) We say that a system satisfies the termination 
condition if there is a reduction order > such that all constraints of index 0 associated with 
the system hold in the reduction order. 

In this section, we assume that the system satisfies the termination condition. As 
expected this entails that the evaluation of closed expressions succeeds. 

Proposition 12 Let $e$ be a closed expression. Then there is a value $v$ such that $e \Downarrow v$ and 
$e \geq v$ with respect to the reduction order. 

Moreover, the following proposition states that a behaviour will always return the 
control to the scheduler. 

Proposition 13 (progress) Let $b$ be an instance of a control point. Then for all stores 
$s$, there exist $X$, $b'$ and $s'$ such that $(b, s) \xrightarrow{X} (b', s')$. 

Finally, we can guarantee that at each instant the system will reach a configuration in 
which the scheduler detects the end of the instant and proceeds to the reinitialisation of 
the store and the status (as specified by rule $(s_2)$). 

Theorem 14 (termination of the instant) All sequences of system reductions involving 
only rule $(s_1)$ are finite. 

Proposition 13 and Theorem 14 are proven by exhibiting a suitable well-founded measure 
which is based both on the reduction order and the fact that the number of reads a 
thread may perform in an instant is finite. 


Example 15 (monitor max value) We consider a recursive behaviour monitoring the 
register i (acting as a fifo channel) and parameterised on a number x representing the 
largest value read so far. At each instant, the behaviour reads the list l of natural numbers 
received on i and assigns to o the greatest number in x and l. 


f(x)       = yield.read⟨i⟩ i with l => f1(maxl(l, x))
f1(x)      = o := x.next.f(x)
max(x, y)  = match x with s(x')
             then match y with s(y') then s(max(x', y')) else s(x')
             else y
maxl(l, x) = match l with cons(y, l') then maxl(l', max(x, y)) else x

It is easy to prove the termination of the thread by recursive path ordering, where the 
function symbols are ordered as $f^+ > f_1^+ > maxl > max$, the arguments of $maxl$ are 
compared lexicographically from left to right, and the constructor symbols are incomparable 
and smaller than any function symbol. 


3.2 Quasi-interpretations 

Our next task is to control the size of the values computed by the threads. To this end, 
we propose a suitable notion of quasi-interpretation (cf. [IjQI 3;^). 

Definition 16 (assignment) Given a program, an assignment $q$ associates with constructors 
and function symbols, functions over the non-negative reals $\mathbb{R}^+$ such that: 

(1) If $c$ is a constant then $q_c$ is the constant 0. 

(2) If $c$ is a constructor with arity $n \geq 1$ then $q_c$ is a function in $(\mathbb{R}^+)^n \to \mathbb{R}^+$ such that 
$q_c(x_1, \ldots, x_n) = d + \sum_{i \in 1..n} x_i$, for some $d \geq 1$. 

(3) If $f$ is a function (name) with arity $n$ then $q_f : (\mathbb{R}^+)^n \to \mathbb{R}^+$ is monotonic and for all 
$i \in 1..n$ we have $q_f(x_1, \ldots, x_n) \geq x_i$. 

An assignment $q$ is extended to all expressions $e$ as follows, giving a function expression 
$q_e$ with variables in $Var(e)$: 

$q_x = x, \quad q_{c(e_1, \ldots, e_n)} = q_c(q_{e_1}, \ldots, q_{e_n}), \quad q_{f(e_1, \ldots, e_n)} = q_f(q_{e_1}, \ldots, q_{e_n}).$

Here $q_x$ is the identity function and, e.g., $q_c(q_{e_1}, \ldots, q_{e_n})$ is the functional composition of 
the function $q_c$ with the functions $q_{e_1}, \ldots, q_{e_n}$. It is easy to check that there exists a constant 
$\delta_q$ depending on the assignment $q$ such that for all values $v$ we have $|v| \leq q_v \leq \delta_q \cdot |v|$. 
Thus the quasi-interpretation of a value is always proportional to its size. 

Definition 17 (quasi-interpretation) An assignment is a quasi-interpretation, if for 
all constraints associated with the system of the shape $f(p) \succ_i e$, with $i \in \{0, 1\}$, the 
inequality $q_{f(p)} \geq q_e$ holds over the non-negative reals. 
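
The claim that the interpretation of a value is proportional to its size can be checked on small examples. The sketch below is an illustration under assumptions of ours: values are encoded as nested Python tuples, the size $|v|$ is taken to be 0 for a constant and 1 plus the sizes of the arguments otherwise, and the dictionary `d` holds the additive constant of each non-constant constructor.

```python
# Values as nested tuples: ("z",), ("s", v), ("nil",), ("cons", v, l).
def nat(n):
    return ("z",) if n == 0 else ("s", nat(n - 1))

def lst(xs):
    out = ("nil",)
    for x in reversed(xs):
        out = ("cons", nat(x), out)
    return out

def size(v):
    """|v| (assumed definition): constants have size 0, other constructors count 1."""
    return 0 if len(v) == 1 else 1 + sum(size(a) for a in v[1:])

def q(v, d):
    """q_v: 0 for a constant, d_c + sum of the argument interpretations otherwise."""
    return 0 if len(v) == 1 else d[v[0]] + sum(q(a, d) for a in v[1:])

def proportional(v, d):
    """Check |v| <= q_v <= delta_q * |v| with delta_q = max of the constants d_c."""
    delta = max(d.values())
    return size(v) <= q(v, d) <= delta * size(v)
```

With all constants equal to 1 the interpretation coincides with the size; a larger constant for `cons` only changes the factor $\delta_q$.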

Quasi-interpretations are designed so as to provide a bound on the size of the computed 
values as a function of the size of the input data. In the following, we assume given a 
suitable quasi-interpretation, q, for the system under investigation. 

Example 18 With reference to Examples [?] and 15, the following assignment is a quasi-interpretation 
(the parameter $i$ corresponds to the label of the read instruction in the body 
of $f$). We give no quasi-interpretations for the function $exp$ because it fails the read once 
condition: 

$q_{nil} = q_z = 0, \quad q_s(x) = x + 1, \quad q_{cons}(x, l) = x + l + 1, \quad q_{dble}(x) = 2 \cdot x,$
$q_{f^+}(x, i) = x + i, \quad q_{f_1^+}(x) = x, \quad q_{maxl}(x, y) = q_{max}(x, y) = max(x, y).$
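
As a sanity check, the inequalities required by Definition 17 can be sampled numerically for this assignment. The sketch below is not a proof: it tests the inequalities on a finite grid only, and the list of constraints is our own reading of the definitions of $f$, $f_1$, $max$ and $maxl$ in Example 15, not a list taken from the paper.

```python
from itertools import product

# Assignment of Example 18 (the Python names are ours).
q_s = lambda x: x + 1
q_cons = lambda x, l: x + l + 1
q_max = q_maxl = max
q_f_plus = lambda x, i: x + i
q_f1_plus = lambda x: x

# One entry per constraint lhs > rhs, read as q_lhs >= q_rhs.
CONSTRAINTS = {
    "max(s(x), s(y)) / s(max(x, y))":
        lambda x, y, l: q_max(q_s(x), q_s(y)) >= q_s(q_max(x, y)),
    "max(s(x), y) / s(x)":
        lambda x, y, l: q_max(q_s(x), y) >= q_s(x),
    "max(x, y) / y":
        lambda x, y, l: q_max(x, y) >= y,
    "maxl(cons(y, l), x) / maxl(l, max(x, y))":
        lambda x, y, l: q_maxl(q_cons(y, l), x) >= q_maxl(l, q_max(x, y)),
    "maxl(l, x) / x":
        lambda x, y, l: q_maxl(l, x) >= x,
    "f+(x, l) / f1+(maxl(l, x))":
        lambda x, y, l: q_f_plus(x, l) >= q_f1_plus(q_maxl(l, x)),
    "f1+(x) / x":
        lambda x, y, l: q_f1_plus(x) >= x,
}

def check(grid):
    """Map each constraint to True if it holds for all (x, y, l) in grid^3."""
    return {name: all(c(x, y, l) for x, y, l in product(grid, repeat=3))
            for name, c in CONSTRAINTS.items()}
```

Calling `check(range(6))` reports, for each constraint, whether the sampled inequality holds.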

One can show m that in the purely functional fragment of our language every value 
$v$ computed during the evaluation of an expression $f(v_1, \ldots, v_n)$ satisfies the following 
condition: 

$|v| \leq q_v \leq q_{f(v_1, \ldots, v_n)} = q_f(q_{v_1}, \ldots, q_{v_n}) \leq q_f(\delta_q \cdot |v_1|, \ldots, \delta_q \cdot |v_n|)$.   (1) 

We generalise this result to threads as follows. 

Theorem 19 (bound on the size of the values) Given a system of synchronous threads 
$B$, suppose that at the beginning of the instant $B_1(i) = f(\mathbf{v})$ for some thread index $i$. Then 
the size of the values computed by the thread $i$ during an instant is bounded by $q_{f^+(\mathbf{v}, \mathbf{u})}$ 
where $\mathbf{u}$ are the values contained in the registers at the time they are read by the thread (or 
some constant value, if they are not read at all). 

Theorem 19 is proven by showing that quasi-interpretations satisfy a suitable invariant. 
In the following corollary, we note that it is possible to express a bound on the size of the 
computed values which depends only on the size of the parameters at the beginning of the 
instant. This is possible because the number of reads a system may perform in an instant 
is bounded by a constant. 

Corollary 20 Let $B$ be a system with $m$ distinct read instructions and $n$ threads. Suppose 
$B_1(i) = f_i(\mathbf{v}_i)$ for $i \in Z_n$. Let $c$ be a bound of the size of the largest parameter of the 
functions $f_i$ and the largest default value of the registers. Suppose $h$ is a function bounding 
all the quasi-interpretations, that is, for all the functions $f_i^+$ we have $h(x) \geq q_{f_i^+}(x, \ldots, x)$ 
over the non-negative reals. Then the size of the values computed by the system $B$ during 
an instant is bounded by $h^{n \cdot m + 1}(c)$. 

Example 21 The $n \cdot m$ iterations of the function $h$ predicted by Corollary 20 correspond to 
a tight bound, as shown by the following example. We assume $n$ threads and one register, 
$r$, of type nat with default value $z$. The control of each thread is described as follows: 

f(x_0) = read r with x_1 => r := dble(max(x_1, x_0)).
         read r with x_2 => r := dble(x_2).
         ...
         read r with x_m => r := dble(x_m).next.f(dble(x_m))

For this system we have $c \geq |x_0|$ and $h(x) = q_{dble}(x) = 2 \cdot x$. It is easy to show that, 
at the end of an instant, there have been $n \cdot m$ assignments to the register $r$ ($m$ for every 
thread in the system) and that the value stored in $r$ is $dble^{n \cdot m}(x_0)$ of size $2^{n \cdot m} \cdot |x_0|$. 
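
The arithmetic of Corollary 20 and Example 21 can be replayed with a few lines of Python. This is a minimal sketch under simplifying assumptions of ours: natural numbers are represented directly by their size, all threads start with the same parameter, and the threads of one instant run one after the other, each performing its $m$ reads and writes before the next one starts.

```python
def iterate(h, k, x):
    """Compute h^k(x), i.e. h applied k times to x."""
    for _ in range(k):
        x = h(x)
    return x

def run_instant(n, m, x0):
    """Size of the value left in register r after one instant of Example 21."""
    r = 0                          # the default value z has size 0
    for _ in range(n):             # threads are run in sequence (assumption)
        r = 2 * max(r, x0)         # first read:  r := dble(max(x_1, x_0))
        for _ in range(m - 1):     # other reads: r := dble(x_k)
            r = 2 * r
    return r

def bound(n, m, c, h=lambda x: 2 * x):
    """The bound h^(n*m+1)(c) of Corollary 20, with h(x) = 2x by default."""
    return iterate(h, n * m + 1, c)
```

With `n = 3`, `m = 2` and `x0 = 5` the simulated register ends at `5 * 2**6`, within the bound `5 * 2**7` obtained by iterating `h`.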

3.3 Combining Termination and Quasi-interpretations 

To bound the space needed for the execution of a system during an instant we also need to 
bound the number of nested recursive calls, i.e. the number of frames that can be found on 
the stack (a precise definition of frame is given in the following Section 4). Unfortunately, 
quasi-interpretations provide a bound on the size of the frames but not on their number 
(at least not in a direct implementation that does not rely on memoization). One way 
to cope with this problem is to combine quasi-interpretations with various families of 
reduction orders mm- In the following, we provide an example of this approach based 
on recursive path orders which is a widely used and fully mechanizable technique to prove 
termination [5]. 

Definition 22 We say that a system terminates by LPO, if the reduction order associated 
with the system is a recursive path order where: (1) symbols are ordered so that function 
symbols are always bigger than constructor symbols and two distinct constructor symbols 
are incomparable; (2) the arguments of function symbols are compared with respect to the 
lexicographic order and those of constructor symbols with respect to the product order. 

Note that because of the hypotheses on constructors, this is actually a special case of 
the lexicographic path order. For the sake of brevity, we still refer to it as LPO. 
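
To make Definition 22 concrete, here is a small Python sketch of such an order on first-order terms. It is our own illustrative implementation, not code from the paper: variables are strings, an application is a tuple `(symbol, arg1, ..., argn)`, and the dictionary `rank` (a hypothetical parameter) gives the precedence of function symbols, every symbol absent from it being treated as a constructor.

```python
def is_var(t):
    return isinstance(t, str)

def occurs(x, t):
    """True if variable x occurs in term t."""
    return t == x if is_var(t) else any(occurs(x, a) for a in t[1:])

def make_lpo(rank):
    def sym_gt(f, g):
        # Function symbols dominate constructors; constructors are incomparable.
        if f in rank and g in rank:
            return rank[f] > rank[g]
        return f in rank and g not in rank

    def ge(s, t):
        return s == t or gt(s, t)

    def gt(s, t):
        if is_var(s):
            return False
        if is_var(t):
            return occurs(t, s)
        f, ss, g, ts = s[0], s[1:], t[0], t[1:]
        if any(ge(a, t) for a in ss):            # some argument already dominates t
            return True
        if sym_gt(f, g):                          # bigger head symbol
            return all(gt(s, b) for b in ts)
        if f != g or len(ss) != len(ts):
            return False
        if f in rank:                             # function symbol: lexicographic
            for i, (a, b) in enumerate(zip(ss, ts)):
                if a != b:
                    return gt(a, b) and all(gt(s, c) for c in ts[i + 1:])
            return False
        # constructor symbol: product order on the arguments
        return (all(ge(a, b) for a, b in zip(ss, ts))
                and any(gt(a, b) for a, b in zip(ss, ts)))

    return gt
```

Used with the precedence of Example 15, e.g. `make_lpo({"f+": 3, "f1+": 2, "maxl": 1, "max": 0})`, the resulting comparison orients the recursive calls of `max` and `maxl` from left to right.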

Definition 23 We say that a system admits a polynomial quasi-interpretation if it has a 
quasi-interpretation where all functions are bounded by a polynomial. 

The following property is a central result of this paper. 

Theorem 24 If a system B terminates by LPO and admits a polynomial quasi-interpretation 
then the computation of the system in an instant runs in space polynomial in the size of 
the parameters of the threads at the beginning of the instant. 

The proof of Theorem 24 is based on Corollary 20, which provides a polynomial bound 
on the size of the computed values and on an analysis of nested calls in the LPO order 
that can be found in Dm. The point is that the depth of such nested calls is polynomial 
in the size of the values and that this allows one to effectively compute a polynomial bounding 
the space necessary for the execution of the system. 

Example 25 We can check that the order used in Example 15 for the functions $f^+$, $f_1^+$, $max$ 
and $maxl$ is indeed an LPO. Moreover, from the quasi-interpretation given in Example 18 
we can deduce that the function $h(x)$ has the shape $a \cdot x + b$ (it is affine). In practice, 
many useful functions admit quasi-interpretations bounded by an affine function such as the 
max-plus polynomials considered in SET 

The combination of LPO and polynomial quasi-interpretation actually provides a characterisation 
of PSPACE. In order to get to PTIME a further restriction has to be imposed. 
Among several possibilities, we select one proposed in [?]. We say that the system terminates 
by linear LPO if it terminates by LPO as in Definition 22 and moreover if in all the 
constraints $f(p) \succ_0 e$ or $f^+(p) \succ_0 g^+(e)$ of index 0 there is at most one function symbol 
on the right-hand side which has the same priority as the (unique) function symbol on the 
left-hand side. For instance, Example 15 falls in this case. In op. cit., it is shown by a 
simple counting argument that the number of calls a function may generate is polynomial 
in the size of its arguments. One can then restate Theorem 24 by replacing LPO with linear 
LPO and PSPACE with PTIME. 

We stress that these results are of a constructive nature, thus beyond proving that a system 
‘runs in PSPACE (or PTIME)’, we can extract a definite polynomial that bounds the size 
needed to run a system during an instant. In general, the bounds are rather rough and 
should be regarded as providing qualitative rather than quantitative information. 

In the purely functional framework, M. Hofmann m has explored the situation where 
a program is non-size increasing which means that the size of all intermediate results is 
bounded by the size of the input. Transferring this concept to a system of threads is 
attractive because it would allow one to predict the behaviour of the system for arbitrarily 
many instants. However, this is problematic. For instance, consider again Example 15. 
By Theorem 24 we can prove that the computation of a system running the behaviour 
$f(x_0)$ in an instant requires a space polynomial in the size of $x_0$. Note that the parameter 
of $f$ is the largest value received so far in the register $i$. Clearly, bounding the value of 
this parameter for arbitrarily many instants requires a global analysis of the system which 
goes against our wish to produce a compositional analysis in the sense explained in the 
Introduction. An alternative approach which remains to be explored could be to develop 
linguistic tools and a programming discipline that allow each thread to control locally the 
size of its parameters. 

4 A Virtual Machine 

We describe a simple virtual machine for our language thus providing a concrete intuition 
for the data structures required for the execution of the programs and the scheduler. 

Our motivations for introducing a low-level model of execution for synchronous threads are 
twofold: (i) it offers a simple formal definition for the space needed for the execution of an 
instant (just take the maximal size of a machine configuration), and (ii) it explains some 
of the elaborate mechanisms occurring during the execution, like the synchronisation with 
the read instruction and the detection of the end of an instant. A further motivation which 
is elaborated in Section 4.5 is the possibility to carry out the static analyses for resource 
control at bytecode level. The interest of bytecode verification is now well understood, and 
we refer the reader to j23l2Ej. 

4.1 Data Structures 

We suppose given the code for all the threads running in a system together with a set 
of types and constructor names and a disjoint set of function names. A function name $f$ 
will also denote the sequence of instructions of the associated code: $f[i]$ stands for the $i$-th 
instruction in the (compiled) code of $f$ and $|f|$ stands for the number of instructions. 

The configuration of the machine is composed of a store $s$, that maps registers to their 
current values, a sequence of records describing the state of each thread in the system, and 
three local registers owned by the scheduler whose role will become clear in Section 4.3. 
A thread identifier, $t$, is simply an index in $Z_n$. The state of a thread $t$ is a pair $(st_t, M_t)$ 
where $st_t$ is a status and $M_t$ is the memory of the thread. A memory $M$ is a sequence 
of frames, and a frame is a triple $(f, pc, \ell)$ composed of a function name, the value of the 
program counter (a natural number in $1..|f|$), and a stack of values $\ell = v_1 \cdots v_k$. We 
denote with $|\ell|$ the number of values in the stack. The status of a thread is defined as 
in the source language, except for the status $W$ which is refined into $W(j, n)$ where: $j$ is 
the index where to jump at the next instant if the thread does not resume in the current 
instant, and $n$ is the (logical) time at which the thread is suspended (cf. Section 4.3). 

4.2 Instructions 

The set of instructions of the virtual machine together with their operational meaning is 
described in Table 1. All instructions operate on the frame of the current thread $t$ and the 
memory $M_t$; the only instructions that depend on or affect the store are read and write. 
For every segment of bytecode, we require that the last instruction is either return, stop 
or tcall and that the jump index j in the instructions branch c j and wait j is within 
the segment. 

4.3 Scheduler 

In Table 2 we describe a simple implementation of the scheduler. The scheduler owns three 
registers: (1) tid that stores the identity of the current thread, (2) time for the current time, 
and (3) wtime for the last time the store was modified. The notion of time here is of a 
logical nature: time passes whenever the scheduler transfers control to a new thread. Like 
in the source language, $s_0$ denotes the store at the beginning of each instant. 

The scheduler triggers the execution of the current instruction of the current thread, whose 
index is stored in tid, with a call to run(tid). The call returns the label $X$ associated with 
the instruction in Table 1. By convention, take $X = \epsilon$ when no label is displayed. If $X \neq \epsilon$ 
then the scheduler must take some action. Assume tid stores the thread index $t$. We denote 
with $pc_{tid}$ the program counter of the top frame $(f, pc_t, \ell)$ in $M_t$, if any, with $I_{tid}$ the 
instruction $f[pc_t]$ (the current instruction in the thread), and with $st_{tid}$ the state $st_t$ of the 
thread. Let us explain 
the role of the status $W(j, n)$ and of the registers time and wtime. We assume that a thread 
waiting for a condition to hold can check the condition without modifying the store. Then 
a thread waiting since time $m$ may pass the condition only if the store has been modified 
at a time $n$ with $m < n$. Otherwise, there is no point in passing the control to it (see footnote 1). With 
this data structure we also have a simple method to detect the end of an instant: it arises 
when no thread is in the running status and all waiting threads were interrupted after the 
last store modification occurred. 

Table 1: Bytecode instructions 

f[pc]         Current memory                                                      Following memory
load k        $M \cdot (f, pc, \ell \cdot v \cdot \ell')$                         $\to M \cdot (f, pc+1, \ell \cdot v \cdot \ell' \cdot v)$, $|\ell| = k-1$
branch c j    $M \cdot (f, pc, \ell \cdot c(v_1, \ldots, v_n))$                   $\to M \cdot (f, pc+1, \ell \cdot v_1 \cdots v_n)$
branch c j    $M \cdot (f, pc, \ell \cdot d(\ldots))$                             $\to M \cdot (f, j, \ell \cdot d(\ldots))$, $c \neq d$
build c n     $M \cdot (f, pc, \ell \cdot v_1 \cdots v_n)$                        $\to M \cdot (f, pc+1, \ell \cdot c(v_1, \ldots, v_n))$
call g n      $M \cdot (f, pc, \ell \cdot v_1 \cdots v_n)$                        $\to M \cdot (f, pc, \ell \cdot v_1 \cdots v_n) \cdot (g, 1, v_1 \cdots v_n)$
tcall g n     $M \cdot (f, pc, \ell \cdot v_1 \cdots v_n)$                        $\to M \cdot (g, 1, v_1 \cdots v_n)$
return        $M \cdot (g, pc', \ell' \cdot \mathbf{v}') \cdot (f, pc, \ell \cdot v)$   $\to M \cdot (g, pc'+1, \ell' \cdot v)$, $ar(f) = |\mathbf{v}'|$
read r        $(M \cdot (f, pc, \ell), s)$                                        $\to (M \cdot (f, pc+1, \ell \cdot s(r)), s)$
read k        $(M \cdot (f, pc, \ell \cdot r \cdot \ell'), s)$                    $\to (M \cdot (f, pc+1, \ell \cdot r \cdot \ell' \cdot s(r)), s)$, $|\ell| = k-1$
write r       $(M \cdot (f, pc, \ell \cdot v), s)$                                $\to (M \cdot (f, pc+1, \ell), s[v/r])$
write k       $(M \cdot (f, pc, \ell \cdot r \cdot \ell' \cdot v), s)$            $\to (M \cdot (f, pc+1, \ell \cdot r \cdot \ell'), s[v/r])$, $|\ell| = k-1$
stop          $M \cdot (f, pc, \ell)$                                             $\xrightarrow{S} \epsilon$
yield         $M \cdot (f, pc, \ell)$                                             $\xrightarrow{R} M \cdot (f, pc+1, \ell)$
next          $M \cdot (f, pc, \ell)$                                             $\xrightarrow{N} M \cdot (f, pc+1, \ell)$
wait j        $M \cdot (f, pc, \ell \cdot v)$                                     $\xrightarrow{W} M \cdot (f, j, \ell)$
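
The functional instructions of Table 1 (load, branch, build, call, return) are enough to run compiled expression bodies. The following Python sketch is a minimal interpreter for that fragment only, written by us for illustration: values are nested tuples, the bytecode of `max` was compiled by hand from Example 15 following Table 3, the table `ARITY` plays the role of $ar(f)$, and returning from the last frame yields the result (a convention of this sketch, since the table only describes returns to a caller).

```python
def nat(n):
    return ("z",) if n == 0 else ("s", nat(n - 1))

def to_int(v):
    return 0 if v[0] == "z" else 1 + to_int(v[1])

# Hand-compiled bytecode of max(x, y); instructions are numbered from 1.
CODE = {"max": [
    ("load", 1), ("branch", "s", 13),
    ("load", 2), ("branch", "s", 10),
    ("load", 3), ("load", 4), ("call", "max", 2), ("build", "s", 1), ("return",),
    ("load", 3), ("build", "s", 1), ("return",),
    ("load", 2), ("return",),
]}
ARITY = {"max": 2}

def run(fname, args):
    frames = [[fname, 1, list(args)]]        # memory M: frames [f, pc, stack]
    while True:
        frame = frames[-1]
        f, pc, stack = frame
        op, *a = CODE[f][pc - 1]
        if op == "load":                     # copy the k-th value on top
            stack.append(stack[a[0] - 1])
            frame[1] = pc + 1
        elif op == "branch":                 # branch c j
            c, j = a
            top = stack[-1]
            if top[0] == c:                  # match: replace c(v1..vn) by v1..vn
                stack.pop()
                stack.extend(top[1:])
                frame[1] = pc + 1
            else:                            # no match: jump, stack unchanged
                frame[1] = j
        elif op == "build":                  # build c n
            c, n = a
            split = len(stack) - n
            stack[split:] = [(c, *stack[split:])]
            frame[1] = pc + 1
        elif op == "call":                   # push a frame, caller pc unchanged
            g, n = a
            frames.append([g, 1, stack[len(stack) - n:]])
        elif op == "return":
            v = stack[-1]
            frames.pop()
            if not frames:                   # top-level return (sketch convention)
                return v
            caller = frames[-1]
            split = len(caller[2]) - ARITY[f]
            caller[2][split:] = [v]          # replace the arguments by the result
            caller[1] += 1
        else:
            raise ValueError(op)
```

For example, `to_int(run("max", [nat(2), nat(3)]))` evaluates the compiled code on the unary encodings of 2 and 3.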

In models based on preemptive threads, it is difficult to foresee the behaviour of the 
scheduler which might depend on timing information not available in the model. For 
this reason and in spite of the fact that most schedulers are deterministic, the scheduler 
is often modelled as a non-deterministic process. In cooperative threads, as illustrated 
here, the interrupt points are explicit in the program and it is possible to think of the 
scheduler as a deterministic process. Then the resulting model is deterministic and this 
fact considerably simplifies its programming, debugging, and analysis. 
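
A deterministic choice of the next thread can be sketched as follows. The function below is only one possible policy, assumed by us for illustration: it scans the threads in round-robin order starting after the current one, and it selects a thread that is running, or that is waiting and was suspended before the last modification of the store. The encoding of statuses as `"R"`, `"N"`, `"S"` and `("W", j, n)` is also ours.

```python
def next_thread(tid, st, wtime):
    """Index of the next thread to run, or None if the instant is over.

    st[k] is "R", "N", "S" or ("W", j, n), where n is the logical time at
    which thread k was suspended; wtime is the time of the last store update.
    """
    count = len(st)
    for offset in range(1, count + 1):       # round-robin scan (assumed policy)
        k = (tid + offset) % count
        status = st[k]
        if status == "R":
            return k
        if isinstance(status, tuple) and status[0] == "W" and status[2] < wtime:
            return k                          # the store changed since k waited
    return None                               # nobody can progress: end of instant
```

When the function returns `None`, no thread is running and every waiting thread was suspended at or after the last store modification, which is the end-of-instant condition described above.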

1 Of course, this condition can be refined by recording the register on which the thread is waiting, the 
shape of the expected value, etc. 

Table 2: An implementation of the scheduler 

for t in Z_n do { st_t := R; }                       (initialisation)
s := s_0; tid := time := wtime := 0;                 (the initial thread is of index 0)
while (tid ∈ Z_n) {                                  (loop until all threads are blocked)
  if I_tid = (write _) then wtime := time;           (record store modified)
  if I_tid = (wait j)
    then st_tid := W(pc_tid + 1, time);              (save continuation for next instant)
  X := run(tid);                                     (run current thread)
  if X ≠ ε then {
    if X ≠ W then st_tid := X;                       (update thread status)
    tid := N(tid, st);                               (compute index of next active thread)
    if tid ∈ Z_n                                     (test whether all threads are blocked)
      then { st_tid := R; time := time + 1; }        (if not, prepare next thread to run)
      else { s := s_0; wtime := time;                (else, initialisation of the new instant)
             tid := N(0, st);                        (select thread to run, starting from 0)
             forall i in Z_n do {
               if st_i = W(j, _) then pc_i := j;
               if st_i ≠ S then st_i := R; } } } }

Conditions on $\mathcal{N}$: 

If $\mathcal{N}(tid, st) = k \in Z_n$ then $st_k = R$ or ($st_k = W(j, n)$ and $n < wtime$). 

If $\mathcal{N}(tid, st) \notin Z_n$ then $\forall k \in Z_n$ ($st_k \neq R$ and 
($st_k = W(j, n)$ implies $n \geq wtime$)). 

Table 3: Compilation of source code to bytecode 

Compilation of expression bodies: 

$C(e, \eta) = C'(e, \eta) \cdot \text{return}$

$C(\text{match } x \text{ with } c(\mathbf{y}) \text{ then } eb_1 \text{ else } eb_2, \eta) =$
    $(\text{branch } c\ j) \cdot C(eb_1, \eta' \cdot \mathbf{y}) \cdot (j : C(eb_2, \eta))$                                  if $\eta = \eta' \cdot x$
    $(\text{load } i(x, \eta)) \cdot (\text{branch } c\ j) \cdot C(eb_1, \eta \cdot \mathbf{y}) \cdot (j : C(eb_2, \eta \cdot x))$   o.w.

Auxiliary compilation of expressions: 

$C'(x, \eta) = (\text{load } i(x, \eta))$
$C'(c(e_1, \ldots, e_n), \eta) = C'(e_1, \eta) \cdots C'(e_n, \eta) \cdot (\text{build } c\ n)$
$C'(f(e_1, \ldots, e_n), \eta) = C'(e_1, \eta) \cdots C'(e_n, \eta) \cdot (\text{call } f\ n)$

Compilation of behaviours: 

$C(\text{stop}, \eta) = \text{stop}$
$C(f(e_1, \ldots, e_n), \eta) = C'(e_1, \eta) \cdots C'(e_n, \eta) \cdot (\text{tcall } f\ n)$
$C(\text{yield}.b, \eta) = \text{yield} \cdot C(b, \eta)$
$C(\text{next}.f(e), \eta) = \text{next} \cdot C(f(e), \eta)$
$C(\varrho := e.b, \eta) = C'(e, \eta) \cdot (\text{write } i(\varrho, \eta)) \cdot C(b, \eta)$

$C(\text{match } x \text{ with } c(\mathbf{y}) \text{ then } b_1 \text{ else } b_2, \eta) =$
    $(\text{branch } c\ j) \cdot C(b_1, \eta' \cdot \mathbf{y}) \cdot (j : C(b_2, \eta))$                                    if $\eta = \eta' \cdot x$
    $(\text{load } i(x, \eta)) \cdot (\text{branch } c\ j) \cdot C(b_1, \eta \cdot \mathbf{y}) \cdot (j : C(b_2, \eta \cdot x))$     o.w.

$C(\text{read } \varrho \text{ with } \cdots \mid c_\ell(\mathbf{y}_\ell) \Rightarrow b_\ell \mid \cdots \mid y_k \Rightarrow b_k \cdots, \eta) =$
    $j_0 : (\text{read } i(\varrho, \eta)) \cdots$
    $j_\ell : (\text{branch } c_\ell\ j_{\ell+1}) \cdot C(b_\ell, \eta \cdot \mathbf{y}_\ell) \cdot$
    $j_{\ell+1} : \cdots\ j_k : C(b_k, \eta \cdot y_k)$

$C(\text{read } \varrho \text{ with } \cdots \mid c_\ell(\mathbf{y}_\ell) \Rightarrow b_\ell \mid \cdots \mid [\_] \Rightarrow g(e), \eta) =$
    $j_0 : (\text{read } i(\varrho, \eta)) \cdots$
    $j_\ell : (\text{branch } c_\ell\ j_{\ell+1}) \cdot C(b_\ell, \eta \cdot \mathbf{y}_\ell) \cdot$
    $j_{\ell+1} : \cdots\ j_n : (\text{wait } j_0) \cdot C(g(e), \eta)$

4.4 Compilation 

In Table 3 we describe a possible compilation of the intermediate language into bytecode. 
We denote with $\eta$ a sequence of variables. If $x$ is a variable and $\eta$ a sequence then $i(x, \eta)$ 
is the index of the rightmost occurrence of $x$ in $\eta$. For instance, $i(x, x \cdot y \cdot x) = 3$. By 
convention, $i(r, \eta) = r$ if $r$ is a register constant. We also use the notation $j : C(be, \eta)$ to 
indicate that $j$ is the position of the first instruction of $C(be, \eta)$. This is just a convenient 
notation since, in practice, the position can be computed explicitly. With every function 
definition $f(x_1, \ldots, x_n) = be$ we associate the bytecode $C(be, x_1 \cdots x_n)$. 
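
The compilation of expression bodies can be sketched in a few lines of Python, with jump positions computed explicitly. This is an illustration under our own encoding, not the paper's implementation: a variable is a string, an application is `(kind, name, args)` with `kind` equal to `"cons"` or `"fun"`, a pattern matching is `("match", x, c, ys, eb1, eb2)`, and $\eta$ is a Python list of variable names.

```python
def index(x, eta):
    """i(x, eta): 1-based position of the rightmost occurrence of x in eta."""
    return len(eta) - eta[::-1].index(x)

def compile_expr(e, eta):
    """C'(e, eta): push the value of expression e on the stack."""
    if isinstance(e, str):
        return [("load", index(e, eta))]
    kind, name, args = e
    code = [ins for a in args for ins in compile_expr(a, eta)]
    return code + [("build" if kind == "cons" else "call", name, len(args))]

def compile_body(eb, eta, start=1):
    """C(eb, eta), where start is the position of the first emitted instruction."""
    if isinstance(eb, tuple) and eb[0] == "match":
        _, x, c, ys, eb1, eb2 = eb
        if eta and eta[-1] == x:             # x is already on top of the stack
            prefix, then_env, else_env = [], eta[:-1] + ys, eta
        else:                                # otherwise load x first
            prefix, then_env, else_env = [("load", index(x, eta))], eta + ys, eta + [x]
        branch_pos = start + len(prefix)
        then_code = compile_body(eb1, then_env, branch_pos + 1)
        j = branch_pos + 1 + len(then_code)  # position of the else branch
        return prefix + [("branch", c, j)] + then_code + compile_body(eb2, else_env, j)
    return compile_expr(eb, eta) + [("return",)]

# Body of max from Example 15 in the encoding above.
MAX_BODY = ("match", "x", "s", ["x1"],
            ("match", "y", "s", ["y1"],
             ("cons", "s", [("fun", "max", ["x1", "y1"])]),
             ("cons", "s", ["x1"])),
            "y")
```

Calling `compile_body(MAX_BODY, ["x", "y"])` produces a fourteen-instruction segment whose two branch instructions jump to positions 13 and 10.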

Example 26 (compiled code) We show below the result of the compilation of the function 
alarm in Example [?]. 


 1 : branch s 12        6 : load 1             11 : tcall alarm 2
 2 : read sig           7 : tcall alarm 2      12 : build prst 0
 3 : branch prst 8      8 : wait 2             13 : write ring
 4 : next               9 : load 1             14 : stop
 5 : load 1            10 : load 2



4.5 Control Flow Analysis Revisited 

As a first step towards control flow analysis, we analyse the flow graph of the bytecode 
generated. 

Definition 27 (flow graph) The flow graph of a system is a directed graph whose nodes 
are pairs $(f, i)$ where $f$ is a function name in the program and $i$ is an instruction index, 
$1 \leq i \leq |f|$, and whose edges are classified as follows: 

Successor: An edge $((f, i), (f, i+1))$ if $f[i]$ is a load, branch, build, call, read, write, 
or yield instruction. 

Branch: An edge $((f, i), (f, j))$ if $f[i] =$ branch $c$ $j$. 

Wait: An edge $((f, i), (f, j))$ if $f[i] =$ wait $j$. 

Next: An edge $((f, i), (f, i+1))$ if $f[i]$ is a wait or next instruction. 

Call: An edge $((f, i), (g, 1))$ if $f[i] =$ call $g$ $n$ or $f[i] =$ tcall $g$ $n$. 
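
The definition translates almost literally into code. The Python sketch below builds the labelled edges of the flow graph for a program given as a dictionary from function names to instruction lists; the tuple encoding of instructions is ours, and the `ALARM` segment is transcribed from the compiled code of Example 26.

```python
ALARM = [
    ("branch", "s", 12), ("read", "sig"), ("branch", "prst", 8), ("next",),
    ("load", 1), ("load", 1), ("tcall", "alarm", 2), ("wait", 2),
    ("load", 1), ("load", 2), ("tcall", "alarm", 2),
    ("build", "prst", 0), ("write", "ring"), ("stop",),
]

SUCCESSOR_OPS = {"load", "branch", "build", "call", "read", "write", "yield"}

def flow_graph(program):
    """Return the set of edges (source, target, kind) of Definition 27."""
    edges = set()
    for f, code in program.items():
        for i, (op, *args) in enumerate(code, start=1):
            if op in SUCCESSOR_OPS:
                edges.add(((f, i), (f, i + 1), "successor"))
            if op == "branch":               # branch c j
                edges.add(((f, i), (f, args[1]), "branch"))
            if op == "wait":                 # wait j
                edges.add(((f, i), (f, args[0]), "wait"))
            if op in {"wait", "next"}:
                edges.add(((f, i), (f, i + 1), "next"))
            if op in {"call", "tcall"}:      # call g n / tcall g n
                edges.add(((f, i), (args[0], 1), "call"))
    return edges

def wait_targets_are_reads(program):
    """Part of the Read-Wait property: every wait j jumps to a read instruction."""
    return all(program[f][j - 1][0] == "read"
               for (f, _), (_, j), kind in flow_graph(program) if kind == "wait")
```

On the alarm code, the only wait edge goes from instruction 8 back to the read at instruction 2.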

The following is easily checked by inspecting the compilation function. Properties Tree 
and Read-Wait entail that the only cycles in the flow graph of a function correspond to 
the compilation of a read instruction. Property Next follows from the fact that, in a 
behaviour, an instruction next is always followed by a function call $f(e)$. Property Read-Once 
is a transposition of the read once condition (Section 2.1) at the level of the bytecode. 




Proposition 28 The flow graph associated with the compilation of a well-formed system 
satisfies the following properties: 

Tree: Let $G'$ be the flow graph without wait and call edges. Let $G'_f$ be the full subgraph of 
$G'$ whose nodes have the shape $(f, i)$. Then $G'_f$ is a tree with root $(f, 1)$. 

Read-Wait: If $f[i] =$ wait $j$ then $f[j] =$ read $r$ and there is a unique path from $(f, j)$ to 
$(f, i)$ and in this path, every node corresponds to a branch instruction. 

Next: Let $G'$ be the flow graph without call edges. If $((f, i), (f, i+1))$ is a next edge then 
for all nodes $(f, j)$ accessible from $(f, i+1)$, $f[j]$ is not a read instruction. 

Read-Once: Let $G'$ be the flow graph without wait edges and next edges. If the source 
code satisfies the read once condition then there is no loop in $G'$ that goes through a 
node $(f, i)$ such that $f[i]$ is a read instruction. 

In [1], we have presented a method to perform resource control verifications at bytecode 
level. This work is just concerned with the functional fragment of our model. Here, 
we outline its generalisation to the full model. The main problem is to reconstruct a 
symbolic representation of the values allocated on the stack. Once this is done, it is rather 
straightforward to formulate the constraints for the resource control. We give first an 
informal description of the method. 

1. For every segment $f$ of bytecode instructions with, say, formal parameters $x_1, \ldots, x_n$ 
and for every instruction $i$ in the segment, we compute a sequence of expressions 
$e_1 \cdots e_m$ and a substitution $\sigma$. 

2. The expressions $(e_i)_{i \in 1..m}$ are related to the formal parameters via the substitution 
$\sigma$. More precisely, the variables in the expressions are contained in $\sigma x_1, \ldots, \sigma x_n$ and 
the latter forms a linear pattern. 

3. Next, let us look at the intended usage of the formal expressions. Suppose at run 
time the function $f$ is called with actual parameters $u_1, \ldots, u_n$ and suppose that 
following this call, the control reaches instruction $i$ with a stack $\ell$. Then we would 
like that: 

• The values $u_1, \ldots, u_n$ match the pattern $\sigma x_1, \ldots, \sigma x_n$ via some substitution $\rho$. 

• The stack $\ell$ contains exactly $m$ values $v_1, \ldots, v_m$ whose types are the ones of 
$e_1, \ldots, e_m$, respectively. 

• Moreover $\rho(e_i)$ is an over-approximation (w.r.t. size and/or termination) of 
the value $v_i$, for $i = 1, \ldots, m$. In particular, if $e_i$ is a pattern, we want that 
$\rho(e_i) = v_i$. 

We now describe precisely the generation of the expressions and the substitutions. This 
computation is called shape analysis in [1]. For every function $f$ and index $i$ such that $f[i]$ 
is a read instruction we assume a fresh variable $x_{f,i}$. Given a total order on the function 
symbols, such variables can be totally ordered with respect to the index $(f, i)$. Moreover, 
for every index $i$ in the code of $f$, we assume a countable set $x_{i,j}$ of distinct variables. 

We assume that the bytecode comes with annotations assigning a suitable type to every 
constructor, register, and function symbol. With every function symbol $f$ of type $\mathbf{t} \to beh$, 
comes a fresh function symbol $f^+$ of type $\mathbf{t}, \mathbf{t}' \to beh$ so that $|\mathbf{t}'|$ is the number of read 
instructions accessible from $f$ within an instant. Then, as in the definition of control points 
(Section 2.2), the extra arguments in $f^+$ correspond to the values read in the registers 
within an instant. The order is chosen according to the order of the variables associated 
with the read instructions. 

In the shape analysis, we will consider well-typed expressions obtained by composition of 
such fresh variables with function symbols, constructors, and registers. In order to make 
explicit the type of a variable $x$ we will write $x^t$. 

For every function $f$, the shape analysis computes a vector $\sigma = \sigma_1, \ldots, \sigma_{|f|}$ of substitutions 
and a vector $E = E_1, \ldots, E_{|f|}$ of sequences of well-typed expressions. We let $E_i$ and $\sigma_i$ 
denote the $i$-th sequence and the $i$-th substitution respectively (the $i$-th element in the vector), 
and $E_i[k]$ the $k$-th element in $E_i$. We also let $h_i = |E_i|$ be the length of the $i$-th sequence. 
We assume $\sigma_1 = id$ and $E_1 = x_1^{t_1} \cdots x_n^{t_n}$, if $f : t_1, \ldots, t_n \to \beta$ is a function of arity $n$. 

The main case is the branch instruction: 


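f[i] =        Conditions
branch c j    $c : \mathbf{t} \to t$, $E_i = E \cdot e$, $e : t$, 
              and either $e = c(\mathbf{e})$, $\sigma_{i+1} = \sigma_i$, $E_{i+1} = E \cdot \mathbf{e}$ 
              or $e = d(\mathbf{e})$, $c \neq d$, $\sigma_j = \sigma_i$, $E_j = E_i$ 
              or $e = x^t$, $\sigma_j = \sigma_i$, $E_j = E_i$, $\sigma' = [c(x^{t_1}_{i+1,h_i}, \ldots, x^{t_n}_{i+1,h_{i+1}})/x]$, 
              $\sigma_{i+1} = \sigma' \circ \sigma_i$, $E_{i+1} = \sigma'(E) \cdot x_{i+1,h_i} \cdots x_{i+1,h_{i+1}}$. 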

The constraints for the remaining instructions are given in Table 4 where it is assumed 
that $\sigma_{i+1} = \sigma_i$ except for the instructions tcall and return (that have no direct successors 
in the code of the function). 

Example 29 We give the shape of the values on the stack (a side result of the shape analysis) 
for the bytecode obtained from the compilation of the function $f$ defined in Example 15. 


Instruction      Shape          Instruction          Shape
1 : yield        x              4 : call maxl 2      x · l · x
2 : read i       x              5 : call f1 1        x · maxl(l, x)
3 : load 1       x · l          6 : return           x · f1(maxl(l, x))


Note that the code has no branch instruction, hence the substitution $\sigma$ is always the identity. 
Once the shapes are generated it is rather straightforward to determine a set of constraints 
that entails the termination of the code and a bound on the size of the computed values. 
For instance, assuming the reduction order is a simplification order, it is enough to require 
that f + (x,l ) > fi(maxl(l,x)), i.e. the shape of the returned value, fi(maxl(l,x)), is less 
than the shape of the call, f + (x,l). 
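Table 4: Shape analysis at bytecode level

Instruction at index $i$    Conditions

load k       $k \in 1..h_i$, $E_{i+1} = E_i \cdot E_i[k]$
build c n    $c : \mathbf{t} \to t$, $E_i = E \cdot \mathbf{e}$, $|\mathbf{e}| = n$, $\mathbf{e} : \mathbf{t}$, $E_{i+1} = E \cdot c(\mathbf{e})$
call g n     $g : \mathbf{t} \to t$, $E_i = E \cdot \mathbf{e}$, $|\mathbf{e}| = n$, $\mathbf{e} : \mathbf{t}$, $E_{i+1} = E \cdot g(\mathbf{e})$
tcall g n    $g : \mathbf{t} \to \beta$, $E_i = E \cdot \mathbf{e}$, $|\mathbf{e}| = n$, $\mathbf{e} : \mathbf{t}$
return       $f : \mathbf{t} \to t$, $E_i = E \cdot e$, $e : t$
read r       $r : Ref(t)$, $E_{i+1} = E_i \cdot x^{t}_{i+1,h_{i+1}}$
read k       $k \in 1..h_i$, $E_i[k] : Ref(t)$, $E_{i+1} = E_i \cdot x^{t}_{i+1,h_{i+1}}$
write r      $r : Ref(t)$, $E_i = E \cdot e$, $e : t$, $E_{i+1} = E$
write k      $k \in 1..h_i$, $E_i[k] : Ref(t)$, $E_i = E \cdot e$, $e : t$, $E_{i+1} = E$
yield        $E_{i+1} = E_i$
next         $E_{i+1} = E_i$
wait j       $E_i = E_j \cdot x^{t}_{j+1,h_{j+1}}$, $E_{i+1} = E_j$, $\sigma_i = \sigma_j$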
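The straight-line rules of the shape analysis (load, build, call, read, write, yield, next, return, tcall) can be illustrated by a small symbolic interpreter. The following Python sketch is only an illustration under simplifying assumptions: typing is ignored, the branch and wait rules are omitted, and the instruction encoding (tuples such as `("call", "maxl", 2)`), the function names and the fresh variable names are hypothetical choices made for this example, not part of the original development.

```python
def show(e):
    """Pretty-print a shape expression: a variable (str) or (head, args)."""
    if isinstance(e, str):
        return e
    head, args = e
    return f"{head}({', '.join(show(a) for a in args)})"


def shape_analysis(code, params, fresh_names):
    """Return the shape E_i found *before* each instruction i.

    code        : list of instruction tuples (hypothetical encoding)
    params      : initial shape E_1, one variable per formal parameter
    fresh_names : names used for the fresh variables pushed by reads
    Only branch-free code is handled, so the substitution stays the identity.
    """
    shape = list(params)
    fresh = iter(fresh_names)
    shapes = []
    for instr in code:
        shapes.append(list(shape))
        op = instr[0]
        if op == "load":                      # E_{i+1} = E_i . E_i[k]
            k = instr[1]
            if not 1 <= k <= len(shape):
                raise ValueError(f"load {k}: index out of range")
            shape.append(shape[k - 1])
        elif op in ("build", "call"):         # E_{i+1} = E . c(e) or E . g(e)
            _, name, n = instr
            if len(shape) < n:
                raise ValueError(f"{op} {name} {n}: stack too short")
            args = tuple(shape[len(shape) - n:])
            del shape[len(shape) - n:]
            shape.append((name, args))
        elif op == "read":                    # push a fresh variable
            shape.append(next(fresh))
        elif op == "write":                   # pop the written expression
            shape.pop()
        elif op in ("yield", "next"):         # shape unchanged
            pass
        elif op in ("return", "tcall"):       # no direct successor
            break
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return shapes


# Bytecode of Example 29, with parameter x and fresh variable l for the read.
EXAMPLE_29 = [("yield",), ("read", "i"), ("load", 1),
              ("call", "maxl", 2), ("call", "f1", 1), ("return",)]
shapes = shape_analysis(EXAMPLE_29, ["x"], ["l"])
for instr, s in zip(EXAMPLE_29, shapes):
    print(instr, " . ".join(show(e) for e in s))
```

On the bytecode of Example 29 the computed shapes coincide with the ones listed in the example; the last shape contains the expression whose comparison with the shape of the call yields the termination constraint.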


If one can find a reduction order and an assignment satisfying the constraints generated 
from the shape analysis then one can show the termination of the instant and provide 
bounds on the size of the computed values. We refrain from developing this part which is 
essentially an adaptation of Section 01 at bytecode level. Moreover, a detailed treatment 
of the functional fragment is available in (Tj . Instead, we state that the shape analysis 
is always successful on the bytecode generated by the compilation function described in 
Table 01 (see Appendix B.8). This should suggest that the control flow analysis is not overly
constraining though it can certainly be enriched in order to take into account some code 
optimisations. 

Theorem 30 The shape analysis succeeds on the compilation of a well-formed program. 


5 Conclusion 

The execution of a thread in a cooperative synchronous model can be regarded as a sequence 
of instants. One can make each instant simple enough so that it can be described as a 
function — our experiments with writing sample programs show that the restrictions we 
impose do not hinder the expressivity of the language. Then well-known static analyses 
used to bound the resources needed for the execution of first-order functional programs can 
be extended to handle systems of synchronous cooperative threads. We believe this provides 
some evidence for the relevance of these techniques in concurrent/embedded programming. 
We also expect that our approach can be extended to a richer programming model including 
more complicated control structures. 

The static analyses we have considered do not try to analyse the whole system. On 
the contrary, they focus on each thread separately and can be carried out incrementally. 
Moreover, it is quite possible to perform them at bytecode level. These characteristics are 
particularly interesting in the framework of ‘mobile code’ where threads can enter or leave
the system at the end of each instant as described in [12].
A Readers-Writers and Other Synchronisation Patterns

A simple, maybe the simplest, example of synchronisation and resource protection is the 
single place buffer. The buffer (initially empty) is implemented by a thread listening to 
two signals. The first on the register put to fill the buffer with a value if it is empty, the 
second on the register get to emit the value stored in the buffer by writing it in the special 
register result and flush the buffer. In this encoding, the register put is a one place channel 
and get is a signal as in Example d Moreover, owing to the read once condition, we are 
not able to react to several put/get requests during the same instant — only if the buffer 
is full can we process one get and one put request in the same instant. Note that the value 
of the buffer is stored on the function call to full(v), hence we use function parameters as 
a kind of private memory (to compare with registers that model shared memory). 

empty() = read put with full(x) => next.full(x) | [_] => empty()

full(x) = read get with prst => result := x.yield.empty() | [_] => full(x)
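The behaviour of the buffer over successive instants can be illustrated with a small state machine. The Python sketch below is a simplified model, not the semantics of the language: it abstracts the scheduler and the registers into the arguments of a single hypothetical function `instant`, where `put` is the value found in the register put (or `None`) and `get` tells whether the signal get is present.

```python
def instant(state, put=None, get=False):
    """Run the buffer for one instant and return (new_state, result).

    state is ("empty",) or ("full", v); result is the value written to the
    register result during the instant, or None if nothing is emitted.
    """
    result = None
    if state[0] == "full" and get:
        # full(x): a get is served, then the thread yields and
        # continues as empty() within the same instant.
        result = state[1]
        state = ("empty",)
    if state[0] == "empty" and put is not None:
        # empty(): a put is accepted; the thread resumes as full(x)
        # at the next instant.
        state = ("full", put)
    return state, result


s = ("empty",)
s, r = instant(s, put=5, get=True)   # empty buffer: the get is ignored
print(s, r)
s, r = instant(s, put=7, get=True)   # full buffer: one get and one put
print(s, r)
```

The second call shows the situation described above: only when the buffer is full can one get and one put request be processed in the same instant.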

Another common example of synchronisation pattern is a situation where we need to 
protect a resource that may be accessed both by ‘readers’ (which access the resource without
modifying it) and ‘writers’ (which can access and modify the resource). This form of
access control is common in databases and can be implemented using traditional synchronisation
mechanisms such as semaphores, but this implementation is far from trivial [27].
In our encoding, a control thread secures the access to the protected resource. The other 
threads, which may be distinguished by their identity id (a natural number), may initiate a 
request to access / release the resource by sending a special value on the dedicated register 
req. The thread regulating the resource may acknowledge at most one request per instant 
and allows the sender of a request to proceed by writing its id on the register allow at 
the next instant. The synchronisation constraints are as follows: there can be multiple 
concurrent readers, there can be only one writer at any one time, pending write requests 
have priority over pending read requests (but do not preempt ongoing read operations). 

We define a new algebraic datatype for assigning requests: 

request = startRead(nat) | startWrite(nat) | endRead | endWrite | none

The value startRead(id) indicates a read request from the thread id, the other constructors
correspond to requests for starting to write, ending to read or ending to write; the
value none stands for no requests. A startRead operation requires that there are no pending
writes to proceed. In that case we increment the number of ongoing readers and allow the
caller to proceed. By contrast, a startWrite puts the monitor thread in a state waiting to
process the pending write request (function pwrite), which waits for the number of readers
to be null and then allows the thread that made the pending write request to proceed.
endRead and endWrite requests are always immediately acknowledged.

The thread protecting the resource starts with the behaviour onlyreader(z), defined in
Table 5, meaning the system has no pending requests for reading or writing. The behaviour
onlyreader(x) encodes the state of the controller when there is no pending write and x
readers. In a state with x pending readers, when a startWrite request from the thread
id is received, the controller thread switches to the behaviour pwrite(id,x), meaning that 
the thread id is waiting to write and that we should wait for x endRead requests before 
acknowledging the request to write. 

A thread willing to read on the protected resource should repeatedly try to send its request
on the register req then poll the register allow, e.g., with the behaviour
askRead(id).read allow with id => ··· where askRead(id) is a shorthand for
read req with none => req := startRead(id).
The code for a thread willing to end a read session is similar. It is simple to change our
encoding so that multiple requests are stored in a FIFO queue instead of a one place buffer.
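The transitions of the controller thread can be summarised as a state machine with one step per instant. The following Python sketch mirrors the case analysis of the behaviours onlyreader and pwrite of Table 5 under a simplifying assumption: the registers req and allow are modelled as the argument and the second result of a hypothetical function `step`, and natural numbers replace the unary encoding with z and s.

```python
def step(state, req=None):
    """One instant of the controller; returns (new_state, allow).

    state : ("onlyreader", n) or ("pwrite", writer_id, n), n = ongoing readers
    req   : ("startRead", id), ("startWrite", id), ("endRead",),
            ("endWrite",) or None when no request is pending
    allow : identity written on the register allow, or None
    """
    kind = req[0] if req else "none"
    if state[0] == "onlyreader":
        n = state[1]
        if kind == "endRead" and n > 0:
            return ("onlyreader", n - 1), None
        if kind == "startWrite":
            wid = req[1]
            # With no ongoing reader the writer proceeds at once,
            # otherwise it waits for the readers to finish.
            return ("pwrite", wid, n), (wid if n == 0 else None)
        if kind == "startRead":
            return ("onlyreader", n + 1), req[1]
        return state, None
    _, wid, n = state
    if kind == "endRead" and n >= 1:
        # The last reader leaving releases the pending writer.
        return ("pwrite", wid, n - 1), (wid if n == 1 else None)
    if kind == "endWrite" and n == 0:
        return ("onlyreader", 0), None
    return state, None   # any other request is not acknowledged


state = ("onlyreader", 0)
for request in [("startRead", 1), ("startWrite", 2), ("startRead", 3),
                ("endRead",), ("endWrite",)]:
    state, allow = step(state, request)
    print(request, "->", state, "allow =", allow)
```

In the run above the read request of thread 3 is not acknowledged while the write request of thread 2 is pending, which reflects the priority of pending writes over pending reads.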

B Proofs 

B.1 Preservation of Control Points Instances

Proposition 31 Suppose $(B, s, i) \to (B', s', i')$ and that for all thread indexes $j \in Z_n$,
$B_1(j)$ is an instance of a control point. Then for all $j \in Z_n$, we have that $B'_1(j)$ is an
instance of a control point.
Table 5: Code for the Readers-Writers pattern

onlyreader(x) =
    match x with s(x') then read req with
            endRead       => next.onlyreader(x')
          | startWrite(y) => next.pwrite(y, s(x'))
          | startRead(y)  => next.allow := y.onlyreader(s(s(x')))
          | [_]           => onlyreader(s(x'))
    else read req with
            startWrite(y) => next.allow := y.pwrite(y, z)
          | startRead(y)  => next.allow := y.onlyreader(s(z))
          | [_]           => onlyreader(z)

pwrite(id, x) =
    match x with s(x') then
        match x' with s(x'') then read req with
                endRead => next.pwrite(id, s(x''))
              | [_]     => pwrite(id, s(s(x'')))
        else read req with
                endRead => next.allow := id.pwrite(id, z)
              | [_]     => pwrite(id, s(z))
    else read req with
            endWrite => next.onlyreader(z)
          | [_]      => pwrite(id, z)

Proof. Let $(f(\mathbf{p}), be, i)$ be a control point of an expression body or of a behaviour. In
Table 6 we reformulate the evaluation and the reduction by replacing expression bodies
or behaviours by triples $(f(\mathbf{p}), be, \sigma)$ where $(f(\mathbf{p}), be, i)$ is a control point and $\sigma$ is a
substitution mapping the variables in $\mathbf{p}$ to values. By convention, we take $\sigma(r) = r$ if $r$ is a
register.

We claim that the evaluation and reduction in Table 6 are equivalent to those presented
in Section El in the following sense:

1. $(f(\mathbf{p}), e_0, \sigma) \Downarrow v$ iff $\sigma e_0 \Downarrow v$.

2. $(f^+(\mathbf{p}), b_0, s, \sigma) \xrightarrow{X} (g^+(\mathbf{q}), b'_0, s', \sigma')$ iff $\sigma b_0 \xrightarrow{X} \sigma' b'_0$.

In the following proofs we will refer to the rules in Table 6. The revised formulation
makes clear that if $b$ is an instance of a control point and $(b, s) \xrightarrow{X} (b', s')$ then $b'$ is an
instance of a control point. It remains to check that being an instance is a property preserved
at the level of system reduction. We proceed by case analysis on the last reduction rule used in the
derivation of $(B, s, i) \to (B', s', i')$.

$(s_1)$ One of the threads performs one step. The property follows by the analysis on
behaviours.

$(s_2)$ One of the threads performs one step. Moreover, the threads in waiting status take the
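Table 6: Expression body evaluation and behaviour reduction revised

$$(e_0)\ \dfrac{}{(f(\mathbf{p}), x, \sigma) \Downarrow \sigma(x)} \qquad (e_1)\ \dfrac{}{(f(\mathbf{p}), r, \sigma) \Downarrow r}$$

$$(e_2)\ \dfrac{(f(\mathbf{p}), e_i, \sigma) \Downarrow v_i \quad i \in 1..n}{(f(\mathbf{p}), c(\mathbf{e}), \sigma) \Downarrow c(\mathbf{v})}$$

$$(e_3)\ \dfrac{(f(\mathbf{p}), e_i, \sigma) \Downarrow v_i \quad i \in 1..n \quad g(\mathbf{x}) = eb \quad (g(\mathbf{x}), eb, [\mathbf{v}/\mathbf{x}]) \Downarrow v}{(f(\mathbf{p}), g(\mathbf{e}), \sigma) \Downarrow v}$$

$$(e_4)\ \dfrac{\sigma(x) = c(\mathbf{v}) \quad (f([c(\mathbf{x})/x]\mathbf{p}), eb_1, [\mathbf{v}/\mathbf{x}] \circ \sigma) \Downarrow v}{(f(\mathbf{p}), \mathsf{match}\ x\ \mathsf{with}\ c(\mathbf{x})\ \mathsf{then}\ eb_1\ \mathsf{else}\ eb_2, \sigma) \Downarrow v}$$

$$(e_5)\ \dfrac{\sigma(x) = d(\ldots) \quad c \neq d \quad (f(\mathbf{p}), eb_2, \sigma) \Downarrow v}{(f(\mathbf{p}), \mathsf{match}\ x\ \mathsf{with}\ c(\mathbf{x})\ \mathsf{then}\ eb_1\ \mathsf{else}\ eb_2, \sigma) \Downarrow v}$$

$$(b_1)\ \dfrac{}{(f^+(\mathbf{p}), \mathsf{stop}, \sigma, s) \xrightarrow{S} (f^+(\mathbf{p}), \mathsf{stop}, \sigma, s)} \qquad (b_2)\ \dfrac{}{(f^+(\mathbf{p}), \mathsf{yield}.b, \sigma, s) \xrightarrow{R} (f^+(\mathbf{p}), b, \sigma, s)}$$

$$(b_3)\ \dfrac{}{(f^+(\mathbf{p}), \mathsf{next}.g(\mathbf{e}), \sigma, s) \xrightarrow{N} (f^+(\mathbf{p}), g(\mathbf{e}), \sigma, s)}$$

$$(b_4)\ \dfrac{\sigma(x) = c(\mathbf{v}) \quad (f^+([c(\mathbf{x})/x]\mathbf{p}), b_1, [\mathbf{v}/\mathbf{x}] \circ \sigma, s) \xrightarrow{X} (f_1^+(\mathbf{p}'), b', \sigma', s')}{(f^+(\mathbf{p}), \mathsf{match}\ x\ \mathsf{with}\ c(\mathbf{x})\ \mathsf{then}\ b_1\ \mathsf{else}\ b_2, \sigma, s) \xrightarrow{X} (f_1^+(\mathbf{p}'), b', \sigma', s')}$$

$$(b_5)\ \dfrac{\sigma(x) = d(\ldots) \quad c \neq d \quad (f^+(\mathbf{p}), b_2, \sigma, s) \xrightarrow{X} (f_1^+(\mathbf{p}'), b', \sigma', s')}{(f^+(\mathbf{p}), \mathsf{match}\ x\ \mathsf{with}\ c(\mathbf{x})\ \mathsf{then}\ b_1\ \mathsf{else}\ b_2, \sigma, s) \xrightarrow{X} (f_1^+(\mathbf{p}'), b', \sigma', s')}$$

$$(b_6)\ \dfrac{\text{no pattern matches } s(\sigma(\varrho))}{(f^+(\mathbf{p}), \mathsf{read}\ \varrho\ \mathsf{with}\ \ldots, \sigma, s) \xrightarrow{W} (f^+(\mathbf{p}), \mathsf{read}\ \varrho\ \mathsf{with}\ \ldots, \sigma, s)}$$

$$(b_7)\ \dfrac{\sigma_1(p) = s(\sigma(\varrho)) \quad (f^+([p/y]\mathbf{p}), b, \sigma_1 \circ \sigma, s) \xrightarrow{X} (f_1^+(\mathbf{p}'), b', \sigma', s')}{(f^+(\mathbf{p}), \mathsf{read}\ \langle y \rangle\ \varrho\ \mathsf{with} \cdots \mid p \Rightarrow b \mid \ldots, \sigma, s) \xrightarrow{X} (f_1^+(\mathbf{p}'), b', \sigma', s')}$$

$$(b_8)\ \dfrac{\sigma\mathbf{e} \Downarrow \mathbf{v} \quad g(\mathbf{x}) = b \quad (g^+(\mathbf{x}, \mathbf{y}_g), b, [\mathbf{v}/\mathbf{x}], s) \xrightarrow{X} (f_1^+(\mathbf{p}'), b', \sigma', s')}{(f^+(\mathbf{p}), g(\mathbf{e}), \sigma, s) \xrightarrow{X} (f_1^+(\mathbf{p}'), b', \sigma', s')}$$

$$(b_9)\ \dfrac{\sigma e \Downarrow v \quad (f^+(\mathbf{p}), b, \sigma, s[v/\sigma(\varrho)]) \xrightarrow{X} (f_1^+(\mathbf{p}'), b', \sigma', s')}{(f^+(\mathbf{p}), \varrho := e.b, \sigma, s) \xrightarrow{X} (f_1^+(\mathbf{p}'), b', \sigma', s')}$$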
$[\_] \Rightarrow g(\mathbf{e})$ branch of the read instructions that were blocking. A thread $\mathsf{read}\ \varrho \ldots \mid [\_] \Rightarrow g(\mathbf{e})$
in waiting status is an instance of a control point $(f^+(\mathbf{p}), \mathsf{read}\ \varrho \ldots \mid [\_] \Rightarrow g(\mathbf{e}_0), j)$.

By $(C_7)$, $(f^+(\mathbf{p}), g(\mathbf{e}_0), 2)$ is a control point, and $g(\mathbf{e})$ is one of its instances. □

B.2 Evaluation of Closed Expressions 

Proposition 32 Let $e$ be a closed expression. Then there is a value $v$ such that $e \Downarrow v$
and $e \geq v$ with respect to the reduction order.

As announced, we refer to the rules in Table 6. We recall that the order $>$ or $\geq$ refers
to the reduction order that satisfies the constraints of index 0. We start by proving the
following working lemma.

Lemma 33 For all well formed triples $(f(\mathbf{p}), eb, \sigma)$, there is a value $v$ such that
$(f(\mathbf{p}), eb, \sigma) \Downarrow v$. Moreover, if $eb$ is an expression then $\sigma(eb) \geq v$ else $f(\sigma\mathbf{p}) \geq v$.

Proof. We proceed by induction on the pair $(f(\sigma\mathbf{p}), eb)$ ordered lexicographically from
left to right. The first argument is ordered according to the reduction order and the second
according to the structure of the expression body.

$eb = x$. We apply rule $(e_0)$ and $\sigma(x) \geq \sigma(x)$.

$eb = r$. We apply rule $(e_1)$ and $\sigma(r) = r \geq r$.

$eb = c(e_1, \ldots, e_n)$. We apply rule $(e_2)$. By inductive hypothesis, $(f(\mathbf{p}), e_i, \sigma) \Downarrow v_i$ for
$i \in 1..n$ and $\sigma e_i \geq v_i$. By definition of reduction order, we derive $\sigma(c(e_1, \ldots, e_n)) \geq
c(v_1, \ldots, v_n)$.

$eb = g(e_1, \ldots, e_n)$. We apply rule $(e_3)$. By inductive hypothesis, $(f(\mathbf{p}), e_i, \sigma) \Downarrow v_i$ for
$i \in 1..n$ and $\sigma e_i \geq v_i$. By the definition of the generated constraints $f(\mathbf{p}) > g(\mathbf{e})$, which
by definition of reduction order implies that $f(\sigma\mathbf{p}) > g(\sigma\mathbf{e}) \geq g(\mathbf{v}) = g([\mathbf{v}/\mathbf{x}]\mathbf{x})$. Thus by
inductive hypothesis, $(g(\mathbf{x}), eb, [\mathbf{v}/\mathbf{x}]) \Downarrow v$, where $eb$ now denotes the body of $g$. We conclude
by showing by case analysis that $g(\sigma\mathbf{e}) \geq v$.

• $eb$ is an expression. By the constraint we have $g(\mathbf{x}) > eb$, and by inductive hypothesis
$[\mathbf{v}/\mathbf{x}]eb \geq v$. So $g(\sigma\mathbf{e}) \geq g(\mathbf{v}) > [\mathbf{v}/\mathbf{x}]eb \geq v$.

• $eb$ is not an expression. Then by inductive hypothesis, $g(\mathbf{v}) \geq v$ and we know
$g(\sigma\mathbf{e}) \geq g(\mathbf{v})$.

$eb = \mathsf{match}\ x\ \mathsf{with}\ c(\mathbf{x}) \ldots$. We distinguish two cases.

• $\sigma(x) = c(\mathbf{v})$. Then rule $(e_4)$ applies. Let $\sigma' = [\mathbf{v}/\mathbf{x}] \circ \sigma$. Note that $\sigma'([c(\mathbf{x})/x]\mathbf{p}) = \sigma\mathbf{p}$.
By inductive hypothesis, we have that $(f([c(\mathbf{x})/x]\mathbf{p}), eb_1, \sigma') \Downarrow v$. We show by case
analysis that $f(\sigma\mathbf{p}) \geq v$.

  – $eb_1$ is an expression. By inductive hypothesis, $\sigma'(eb_1) \geq v$. By the constraint,
  $f([c(\mathbf{x})/x]\mathbf{p}) > eb_1$. Hence, $f(\sigma\mathbf{p}) = f(\sigma'[c(\mathbf{x})/x]\mathbf{p}) > \sigma'(eb_1) \geq v$.

  – $eb_1$ is not an expression. By inductive hypothesis, we have that $f(\sigma\mathbf{p})$ equals
  $f(\sigma'[c(\mathbf{x})/x]\mathbf{p}) \geq v$.

• $\sigma(x) = d(\ldots)$ with $c \neq d$. Then rule $(e_5)$ applies and an argument simpler than the
one above allows us to conclude. □

Relying on Lemma 33 we can now prove Proposition 32, that if $e$ is a closed expression
and $e \Downarrow v$ then $e \geq v$ in the reduction order.

Proof. We proceed by induction on the structure of $e$.

$e$ is a value $v$. Then $v \Downarrow v$ and $v \geq v$.

$e = c(e_1, \ldots, e_n)$. By inductive hypothesis, $e_i \Downarrow v_i$ and $e_i \geq v_i$ for $i \in 1..n$. By definition
of reduction order, $c(\mathbf{e}) \geq c(\mathbf{v})$.

$e = f(e_1, \ldots, e_n)$. By inductive hypothesis, $e_i \Downarrow v_i$ and $e_i \geq v_i$ for $i \in 1..n$. Suppose
$f(\mathbf{x}) = eb$. By Lemma 33, $(f(\mathbf{x}), eb, [\mathbf{v}/\mathbf{x}]) \Downarrow v$ and either $f(\mathbf{v}) \geq v$ or $f(\mathbf{x}) > eb$ and
$\sigma(eb) \geq v$ with $\sigma = [\mathbf{v}/\mathbf{x}]$. We conclude by a simple case analysis. □

B.3 Progress 

Proposition 34 Let $b$ be an instance of a control point. Then for all stores $s$, there
exist a store $s'$ and a status $X$ such that $(b, s) \xrightarrow{X} (b', s')$.

Proof. We start by defining a suitable well-founded order. If $b$ is a behaviour, then let
$nr(b)$ be the maximum number of reads that $b$ may perform in an instant. Moreover, let
$ln(b)$ be the length of $b$ inductively defined as follows:

$ln(\mathsf{stop}) = ln(f(\mathbf{e})) = 0$
$ln(\mathsf{yield}.b) = ln(\varrho := e.b) = 1 + ln(b)$
$ln(\mathsf{next}.f(\mathbf{e})) = 2$
$ln(\mathsf{match}\ x\ \mathsf{with}\ c(\mathbf{x})\ \mathsf{then}\ b_1\ \mathsf{else}\ b_2) = 1 + \max(ln(b_1), ln(b_2))$
$ln(\mathsf{read}\ \varrho\ \mathsf{with} \ldots \mid p_i \Rightarrow b_i \mid \ldots \mid [\_] \Rightarrow f(\mathbf{e})) = 1 + \max(\ldots, ln(b_i), \ldots)$

If the behaviour $b$ is an instance of the control point $\gamma = (f^+(\mathbf{p}), b_0, i)$ via a substitution $\sigma$
then we associate with the pair $(b, \gamma)$ a measure:

$\mu(b, \gamma) =_{def} (nr(b), f^+(\sigma\mathbf{p}), ln(b))$.

We assume that measures are lexicographically ordered from left to right, where the
order on the first and third component is the standard order on natural numbers and the
order on the second component is the reduction order considered in the study of the termination
conditions. This is a well-founded order. Now we show the assertion by induction on
$\mu(b, \gamma)$. We proceed by case analysis on the structure of $b$.
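The comparison of measures used in this induction can be sketched in a few lines of Python. The sketch is only illustrative: the function `lex_greater` and its arguments are hypothetical names, and in the usage example the reduction order on the second component is replaced by the usual order on integers standing in for terms.

```python
import operator


def lex_greater(m1, m2, orders):
    """Strict lexicographic comparison of two tuples, left to right.

    orders[i](a, b) must decide the strict order a > b on component i.
    The first component on which the tuples differ decides the result.
    """
    for a, b, greater in zip(m1, m2, orders):
        if greater(a, b):
            return True
        if a != b:
            return False   # smaller or incomparable on this component
    return False           # equal tuples are not strictly ordered


# Measures (nr(b), f+(sigma p), ln(b)); integers stand in for the terms.
orders = [operator.gt, operator.gt, operator.gt]
print(lex_greater((2, 5, 3), (2, 5, 2), orders))    # ln(b) decreases
print(lex_greater((2, 0, 0), (1, 9, 9), orders))    # nr(b) decreases
```

Since each component order is well-founded, so is the lexicographic combination, which is what justifies the induction on the measure.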

$b = \mathsf{stop}$. Rule $(b_1)$ applies, with $X = S$, and the measure stays constant.

$b = \mathsf{yield}.b'$. Rule $(b_2)$ applies, with $X = R$, and the measure decreases because $ln(b)$
decreases.

$b = \mathsf{next}.b'$. Rule $(b_3)$ applies, with $X = N$, and the measure decreases because $ln(b)$
decreases.

$b = \mathsf{match} \ldots$. Rules $(b_4)$ or $(b_5)$ apply and the measure decreases because $ln(b)$ decreases.

$b = \mathsf{read} \ldots$. If no pattern matches then rule $(b_6)$ applies and the measure is left unchanged.
If a pattern matches then rule $(b_7)$ applies and the measure decreases because
$nr(b)$ decreases and then the induction hypothesis applies.

$b = g(\mathbf{e})$. Rule $(b_8)$ applies to $(f^+(\mathbf{p}), g(\mathbf{e}_0), \sigma)$, assuming $\mathbf{e} = \sigma\mathbf{e}_0$. By Proposition 32 we
know that $\mathbf{e} \Downarrow \mathbf{v}$ and $\mathbf{e} \geq \mathbf{v}$ in the reduction order. Suppose $g$ is associated to the declaration
$g(\mathbf{x}) = b$. The constraint associated with the control point requires $f^+(\mathbf{p}) > g^+(\mathbf{e}_0, \mathbf{y}_g)$.
Then using the properties of reduction orders we observe:

$f^+(\sigma\mathbf{p}) > g^+(\sigma\mathbf{e}_0, \mathbf{y}_g) = g^+(\mathbf{e}, \mathbf{y}_g) \geq g^+(\mathbf{v}, \mathbf{y}_g)$

Thus the measure decreases because $f^+(\sigma\mathbf{p}) > g^+(\mathbf{v}, \mathbf{y}_g)$, and then the induction hypothesis
applies.

$b = \varrho := e.b'$. By Proposition 32 we have $e \Downarrow v$. Hence rule $(b_9)$ applies, the measure
decreases because $ln(b)$ decreases, and then the induction hypothesis applies. □


Remark 35 We point out that in the proof of Proposition 34, if $X = R$ then the measure
decreases and if $X \in \{N, S, W\}$ then the measure decreases or stays the same. We use this
observation in the following proof of Theorem 36.


B.4 Termination of the Instant 

Theorem 36 All sequences of system reductions involving only rule $(s_1)$ are finite.

Proof. We order the status of threads as follows: $R > N, S, W$. With a behaviour $B_1(i)$
coming with a control point $\gamma_i$, we associate the pair $\mu'(i) = (\mu(B_1(i), \gamma_i), B_2(i))$ where
$\mu$ is the measure defined in the proof of Proposition 34. Thus $\mu'(i)$ can be regarded as
a quadruple with a lexicographic order from left to right. With a system $B$ of $n$ threads
we associate the measure $\mu_B =_{def} (\mu'(0), \ldots, \mu'(n-1))$ that is a tuple. We compare such
tuples using the product order. We prove that every system reduction sequence involving
only rule $(s_1)$ terminates by proving that this measure decreases during reduction. We
recall the rule below:

$$(s_1)\ \dfrac{(B_1(i), s) \xrightarrow{X} (b', s') \quad B_2(i) = R \quad B' = B[(b', X)/i] \quad \mathcal{N}(B', s', i) = k}{(B, s, i) \to (B'[(B'_1(k), R)/k], s', k)}$$
Let $B'' = B'[(B'_1(k), R)/k]$. We proceed by case analysis on $X$ and $B'_2(k)$.

If $B'_2(k) = R$ then $\mu'(k)$ is left unchanged. The only other case is $B'_2(k) = W$. In this
case the conditions on the scheduler tell us that $i \neq k$. Indeed, the thread $k$ must be
blocked on a read $r$ instruction and it can only be scheduled if the value stored in $r$ has
been modified, which means that some thread other than $k$ must have modified $r$. For the
same reason, some pattern in the read $r$ instruction of $B_1(k)$ matches $s'(r)$, which means
that the number of reads that $B_1(k)$ may perform in the current instant decreases and that
$\mu'(k)$ also decreases.

By hypothesis we have $(B_1(i), s) \xrightarrow{X} (b', s')$, hence by Remark 35 $\mu'(i)$ decreases or stays
the same. By the previous line of reasoning $\mu'(k)$ decreases and the other measures $\mu'(j)$
stay the same. Hence the measure $\mu_B$ decreases, as needed. □

B.5 Bounding the Size of Values for Threads 

Theorem 37 Given a system of synchronous threads $B$, suppose that at the beginning
of the instant $B_1(i) = f(\mathbf{v})$ for some thread index $i$. Then the size of the values computed
by the thread $i$ during an instant is bounded by $q_{f^+}(\mathbf{v}, \mathbf{u})$ where $\mathbf{u}$ are the values contained
in the registers at the time they are read by the thread (or some constant value, if they are
not read at all).

In Table 6 we have defined the reduction of behaviours as a big step semantics. In Table 7
we reformulate the operational semantics following a small step approach. First, note that
there are no rules corresponding to $(b_1)$, $(b_3)$ or $(b_6)$ since these rules either terminate
or suspend the computation of the thread in the instant. Second, the reduction makes
abstraction of the memory and the scheduler. Instead, the reduction relation is parameterized
on an assignment $\delta$ associating values with the labels of the read instructions.

The assignment $\delta$ is a kind of oracle that provides the thread with the finitely many values
(because of the read once condition) it may read within the current instant. The assignment
$\delta$ provides a safe abstraction of the store $s$ used in the transition rules of Table 6.
Note that the resulting system represents more reductions than can actually occur in the
original semantics within an instant. Namely, a thread can write a value $v$ in $r$ and then
proceed to read from $r$ a value different from $v$ without yielding the control. This kind
of reduction is impossible in the original semantics. However, since we do not rely on a
precise monitoring of the values written in the store, this loss of precision does not affect
our analysis.

Next we prove that if $(f^+(\mathbf{p}), b, \sigma) \to_\delta (g^+(\mathbf{q}), b', \sigma')$ then $q_{f^+((\sigma'' \circ \sigma)(\mathbf{p}))} \geq q_{g^+(\sigma'(\mathbf{q}))}$ over the
non-negative reals, where $\sigma''$ is either the identity or the restriction of $\delta$ to the label of the
read instruction in case $(b'_7)$.

Proof. By case analysis on the small step rules. Cases $(b'_2)$, $(b'_5)$ and $(b'_9)$ are immediate.

$(b'_4)$ The assertion follows by a straightforward computation on substitutions.

$(b'_7)$ Then $\sigma''(y) = \delta(y) = \sigma_1(p)$ and recalling that patterns are linear, we note that:

$f^+((\sigma'' \circ \sigma)(\mathbf{p})) = f^+((\sigma_1 \circ \sigma)[p/y](\mathbf{p}))$.


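Table 7: Small step reduction within an instant

$(b'_2)$  $(f^+(\mathbf{p}), \mathsf{yield}.b, \sigma) \to_\delta (f^+(\mathbf{p}), b, \sigma)$

$(b'_4)$  $(f^+(\mathbf{p}), \mathsf{match}\ x\ \mathsf{with}\ c(\mathbf{x})\ \mathsf{then}\ b_1\ \mathsf{else}\ b_2, \sigma) \to_\delta (f^+([c(\mathbf{x})/x]\mathbf{p}), b_1, [\mathbf{v}/\mathbf{x}] \circ \sigma)$ if (1)

$(b'_5)$  $(f^+(\mathbf{p}), \mathsf{match}\ x\ \mathsf{with}\ c(\mathbf{x})\ \mathsf{then}\ b_1\ \mathsf{else}\ b_2, \sigma) \to_\delta (f^+(\mathbf{p}), b_2, \sigma)$ if $\sigma(x) = d(\ldots)$, $c \neq d$

$(b'_7)$  $(f^+(\mathbf{p}), \mathsf{read}\ \langle y \rangle\ \varrho\ \mathsf{with} \cdots \mid p \Rightarrow b \mid \ldots, \sigma) \to_\delta (f^+([p/y]\mathbf{p}), b, \sigma_1 \circ \sigma)$ if (2)

$(b'_8)$  $(f^+(\mathbf{p}), g(\mathbf{e}), \sigma) \to_\delta (g^+(\mathbf{x}, \mathbf{y}_g), b, [\mathbf{v}/\mathbf{x}])$ if $\sigma\mathbf{e} \Downarrow \mathbf{v}$ and $g(\mathbf{x}) = b$

$(b'_9)$  $(f^+(\mathbf{p}), \varrho := e.b, \sigma) \to_\delta (f^+(\mathbf{p}), b, \sigma)$ if $\sigma e \Downarrow v$

where: (1) $\equiv \sigma(x) = c(\mathbf{v})$ and (2) $\equiv \sigma_1(p) = \delta(y)$.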


$(b'_8)$ By the properties of quasi-interpretations, we know that $q_{\sigma(\mathbf{e})} \geq q_{\mathbf{v}}$. By the constraints
generated by the control points, we derive that $q_{f^+(\mathbf{p})} \geq q_{g^+(\mathbf{e}, \mathbf{y}_g)}$ over the non-negative
reals. By the substitutivity property of quasi-interpretations, this implies that
$q_{f^+(\sigma(\mathbf{p}))} \geq q_{g^+(\sigma(\mathbf{e}, \mathbf{y}_g))}$. Thus we derive, as required: $q_{f^+(\sigma(\mathbf{p}))} \geq q_{g^+(\sigma(\mathbf{e}, \mathbf{y}_g))} \geq q_{g^+(\mathbf{v}, \mathbf{y}_g)}$. □

It remains to support our claim that all values computed by the thread $i$ during an
instant have a size bounded by $q_{f^+}(\mathbf{v}, \mathbf{u})$ where $\mathbf{u}$ are either the values read by the thread or
some constant value.

Proof. By inspecting the shape of behaviours we see that a thread computes values either
when writing into a register or in recursive calls. We consider in turn the two cases.

Writing. Suppose $(f^+(\mathbf{p}, \mathbf{y}_f), b, \sigma) \to_\delta^* (g^+(\mathbf{q}), \varrho := e.b', \sigma')$ by performing a series of reads
recorded by the substitution $\sigma''$. Then the invariant we have proved above implies that:
$q_{f^+((\sigma'' \circ \sigma)(\mathbf{p}, \mathbf{y}_f))} \geq q_{g^+(\sigma'\mathbf{q})}$ over the non-negative reals. If some of the variables in $\mathbf{y}_f$ are not
instantiated by the substitution $\sigma''$, then we may replace them by some constant. Next,
we observe that the constraint of index 1 associated with the control point requires that
$q_{g^+(\mathbf{q})} \geq q_e$ and that if $\sigma'(e) \Downarrow v$ then this implies $q_{g^+(\sigma'(\mathbf{q}))} \geq q_{\sigma'(e)} \geq q_v \geq |v|$.

Recursive call. Suppose $(f^+(\mathbf{p}, \mathbf{y}_f), b, \sigma) \to_\delta^* (g^+(\mathbf{q}), h(\mathbf{e}), \sigma')$ by performing a series of
reads recorded by the substitution $\sigma''$. Then the invariant we have proved above implies
that: $q_{f^+((\sigma'' \circ \sigma)(\mathbf{p}, \mathbf{y}_f))} \geq q_{g^+(\sigma'(\mathbf{q}))}$ over the non-negative reals. Again, if some of the variables
in $\mathbf{y}_f$ are not instantiated by the substitution $\sigma''$, then we may replace them by some
constant value. Next we observe that the constraint of index 0 associated with the control
point requires that $q_{g^+(\mathbf{q})} \geq q_{h^+(\mathbf{e}, \mathbf{y}_h)}$. Moreover, if $\sigma'(\mathbf{e}) \Downarrow \mathbf{v}$ then $q_{g^+(\sigma'(\mathbf{q}))} \geq q_{h^+(\sigma'(\mathbf{e}, \mathbf{y}_h))} \geq
q_{h^+(\mathbf{v}, \mathbf{y}_h)} \geq q_{v_i} \geq |v_i|$, where $v_i$ is any of the values in $\mathbf{v}$. The last inequation relies
on the monotonicity property of assignments, see property (3) in Definition EH that is
$q_{h^+}(z_1, \ldots, z_n) \geq z_j$ for all $j \in 1..n$. □

B.6 Bounding the Size of Values for Systems 

Corollary 38 Let $B$ be a system with $m$ distinct read instructions and $n$ threads.
Suppose $B_1(i) = f_i(\mathbf{v}_i)$ for $i \in Z_n$. Let $c$ be a bound of the size of the largest parameter of the
functions $f_i$ and the largest default value of the registers. Suppose $h$ is a function bounding
all the quasi-interpretations, that is, for all the functions $f_i^+$ we have $h(x) \geq q_{f_i^+}(x, \ldots, x)$
over the non-negative reals. Then the size of the values computed by the system $B$ during
an instant is bounded by $h^{n \cdot m + 1}(c)$.

Proof. Because of the read once condition, during an instant a system can perform a
(successful) read at most $n \cdot m$ times. We proceed by induction on the number $k$ of reads
the system has performed so far to prove that the size of the values is bounded by $h^{k+1}(c)$.

$k = 0$. If no read has been performed, then Theorem 37 can be applied to show that all
values have size bounded by $h(c)$.

$k > 0$. Inductively, the size of the values in the parameters and the registers is bounded
by $h^k(c)$. Theorem 37 says that all the values that can be computed before performing a
new read have a size bounded by $h(h^k(c)) = h^{k+1}(c)$. □
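To make the shape of this bound concrete, the iteration $h^{n \cdot m + 1}(c)$ can be computed directly. The Python sketch below is a numerical illustration only; the function names and the sample polynomial $h(x) = x^2 + 1$ are assumptions chosen for the example and do not come from any particular system.

```python
def iterate(h, k, c):
    """Apply the function h to c exactly k times, i.e. compute h^k(c)."""
    for _ in range(k):
        c = h(c)
    return c


def system_bound(h, n, m, c):
    """Bound h^(n*m+1)(c) for n threads and m distinct read instructions."""
    return iterate(h, n * m + 1, c)


# Hypothetical instance: h(x) = x^2 + 1, two threads, one read instruction,
# and initial parameters of size at most 2.
print(system_bound(lambda x: x * x + 1, n=2, m=1, c=2))
```

Since $n$ and $m$ are fixed by the program, the number of iterations is a constant, so the composition of a polynomial $h$ with itself remains a polynomial in $c$, although its degree grows with $n \cdot m$.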

B.7 Combination of LPO and Polynomial Quasi-interpretations 

Theorem 39 If a system $B$ terminates by LPO and admits a polynomial quasi-interpretation
then the computation of the system in an instant runs in space polynomial in the
size of the parameters of the threads at the beginning of the instant.

Proof. We can always choose a polynomial for the function $h$ in Corollary 38. Hence,
$h^{n \cdot m + 1}$ is also a polynomial. This shows that the size of all the values computed by the
system is bounded by a polynomial. The number of values in a frame depends on the
number of formal parameters and local variables and it can be statically bounded. It remains
to bound the number of frames on the stack. Note that behaviours are tail recursive. 
This means that the stack of each thread contains a frame that never returns a value plus 
possibly a sequence of frames that relate to the evaluation of expressions. 

From this point on, one can follow the proof in D39' The idea is to exploit the characteristics 
of the LPO order: a nested sequence of recursive calls $f_1(\mathbf{v}_1), \ldots, f_n(\mathbf{v}_n)$ must satisfy
$f_1(\mathbf{v}_1) > \cdots > f_n(\mathbf{v}_n)$, where $>$ is the LPO order on terms. Because of the polynomial
bound on the size of the values and the characteristics of the LPO on constructors, one 
can provide a polynomial bound on the length of such strictly decreasing sequences and 
therefore a polynomial bound on the size of the stack needed to execute the system. □ 
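For readers unfamiliar with the lexicographic path order, the following Python sketch implements the standard textbook definition of LPO on first-order terms. It is an illustration under stated assumptions: terms are encoded as strings (variables) or pairs `(symbol, args)`, the precedence is a hypothetical dictionary from symbols to integers, and all symbols are given lexicographic status.

```python
def occurs(x, t):
    """True if the variable x occurs in the term t."""
    if isinstance(t, str):
        return x == t
    return any(occurs(x, a) for a in t[1])


def lpo_gt(s, t, prec):
    """Strict lexicographic path order s > t for the precedence prec."""
    if isinstance(s, str):
        return False                      # a variable dominates nothing
    if isinstance(t, str):
        return occurs(t, s)               # s > x iff x occurs in s
    f, ss = s
    g, ts = t
    # (1) some argument of s is equal to t or greater than t
    if any(si == t or lpo_gt(si, t, prec) for si in ss):
        return True
    # (2) the head of s is bigger and s dominates every argument of t
    if prec[f] > prec[g]:
        return all(lpo_gt(s, tj, prec) for tj in ts)
    # (3) same head: the first differing argument decreases and
    #     s dominates every argument of t
    if f == g and len(ss) == len(ts):
        for si, ti in zip(ss, ts):
            if si != ti:
                return (lpo_gt(si, ti, prec)
                        and all(lpo_gt(s, tj, prec) for tj in ts))
    return False


# Nested calls on decreasing unary numbers: f(s(s(z))) > f(s(z)) > f(z).
prec = {"f": 2, "s": 1, "z": 0}
z = ("z", ())
one = ("s", (z,))
two = ("s", (one,))
print(lpo_gt(("f", (two,)), ("f", (one,)), prec),
      lpo_gt(("f", (one,)), ("f", (z,)), prec))
```

The example chain is strictly decreasing, as required of nested recursive calls; the length of any such chain is what the proof bounds polynomially.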

B.8 Compiled Code is Well-shaped 

Theorem 40 The shape analysis succeeds on the compilation of a well-formed program.

Let $be$ be either a behaviour or an expression body, $\eta$ be a sequence of variables, and
$E$ be a sequence of expressions. We say that the triple $(be, \eta, E)$ is compatible if for all
variables $x$ free in $be$, the index $i(x, \eta)$ is defined and if $\eta[k] = x$ then $E[k] = x$. Moreover,
we say that the triple is strongly compatible if it is compatible and $|\eta| = |E|$. In the
following we will neglect typing issues that offer no particular difficulty. First we prove the
following lemma.

Lemma 41 If $(e, \eta, E)$ is compatible then the shape analysis of $C'(e, \eta)$ starting from the
shape $E$ succeeds and produces a shape $E \cdot e$.

Proof. By induction on the structure of $e$.

$e = x$. Then $C'(x, \eta) = \mathsf{load}\ i(x, \eta)$. We know that $i(x, \eta)$ is defined and $\eta[k] = x$ implies
$E[k] = x$. So the shape analysis succeeds and produces $E \cdot x$.

$e = c(e_1, \ldots, e_n)$. Then $C'(c(e_1, \ldots, e_n), \eta) = C'(e_1, \eta) \cdots C'(e_n, \eta)\ (\mathsf{build}\ c\ n)$. We note
that if $e'$ is a subexpression of $e$, $e''$ is another expression, and $(e, \eta, E)$ is compatible then
$(e', \eta, E \cdot e'')$ is compatible too. Thus we can apply the inductive hypothesis to $e_1, \ldots, e_n$
and derive that the shape analysis of $C'(e_1, \eta)$ starting from $E$ succeeds and produces
$E \cdot e_1$, ..., and the shape analysis of $C'(e_n, \eta)$ starting from $E \cdot e_1 \cdots e_{n-1}$ succeeds and
produces $E \cdot e_1 \cdots e_n$. Then by the definition of shape analysis of build we can conclude.

$e = f(e_1, \ldots, e_n)$. An argument similar to the one above applies. □
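The statement of the lemma can be checked on small instances with a direct implementation of the compilation of expressions and of the corresponding shape rules. The Python sketch below is an assumption-laden illustration: expressions are strings (variables) or pairs `(head, args)`, the set `constructors` distinguishes build from call, and $i(x, \eta)$ is taken to be the 1-based position of the first occurrence of $x$ in $\eta$, a choice the text above does not spell out.

```python
def compile_expr(e, eta, constructors):
    """Compilation C'(e, eta) of an expression into load/build/call code."""
    if isinstance(e, str):
        return [("load", eta.index(e) + 1)]      # i(x, eta), 1-based
    head, args = e
    code = [ins for a in args for ins in compile_expr(a, eta, constructors)]
    op = "build" if head in constructors else "call"
    return code + [(op, head, len(args))]


def run_shape(code, shape):
    """Shape analysis restricted to load, build and call instructions."""
    shape = list(shape)
    for ins in code:
        if ins[0] == "load":
            shape.append(shape[ins[1] - 1])      # E . E[k]
        else:
            _, head, n = ins
            args = tuple(shape[len(shape) - n:])
            del shape[len(shape) - n:]
            shape.append((head, args))           # E . c(e) or E . g(e)
    return shape


# A compatible triple (e, eta, E) with eta = E = x y.
e = ("f", (("c", ("x", "y")), "x"))
eta = ["x", "y"]
code = compile_expr(e, eta, constructors={"c"})
print(code)
print(run_shape(code, eta) == eta + [e])
```

Starting from the shape $E$, the analysis of the compiled code ends with the shape $E \cdot e$, which is the property established by the lemma.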

Next we generalise the lemma to behaviours and expression bodies. 

Lemma 42 If $(be, \eta, E)$ is strongly compatible then the shape analysis of $C(be, \eta)$ starting
from the shape $E$ succeeds.

Proof. $be = e$. We have that $C(e, \eta) = C'(e, \eta) \cdot \mathsf{return}$ and the shape analysis on $C'(e, \eta)$
succeeds, producing at least one expression.

$be = \mathsf{match}\ x\ \mathsf{with}\ c(\mathbf{y})\ \mathsf{then}\ eb_1\ \mathsf{else}\ eb_2$. Following the definition of the compilation
function, we distinguish two cases:

• $\eta = \eta' \cdot x$: Then $C(be, \eta) = (\mathsf{branch}\ c\ j) \cdot C(eb_1, \eta' \cdot \mathbf{y}) \cdot (j : C(eb_2, \eta))$. By the
hypothesis of strong compatibility, $E = E' \cdot x$ and by definition of shape analysis on
branch we get on the then branch a shape $[c(\mathbf{y})/x]E' \cdot \mathbf{y}$ up to variable renaming. We
observe that $(eb_1, \eta' \cdot \mathbf{y}, [c(\mathbf{y})/x]E' \cdot \mathbf{y})$ is strongly compatible (note that here we rely
on the fact that $\eta'$ and $E'$ have the same length). Hence, by inductive hypothesis,
the shape analysis on $C(eb_1, \eta' \cdot \mathbf{y})$ succeeds. As for the else branch, we have a shape
$E' \cdot x$ and since $(eb_2, \eta' \cdot x, E' \cdot x)$ is strongly compatible we derive by inductive
hypothesis that the shape analysis on $C(eb_2, \eta)$ succeeds.

• $\eta \neq \eta' \cdot x$: The compiled code starts with $(\mathsf{load}\ i(x, \eta))$ which produces a shape $E \cdot x$.
Then the analysis proceeds as in the previous case.

$be = \mathsf{stop}$. The shape analysis succeeds.

$be = f(e_1, \ldots, e_n)$. By Lemma 41 we derive that the shape analysis of $C'(e_1, \eta) \cdots C'(e_n, \eta)$
succeeds and produces $E \cdot e_1 \cdots e_n$. We conclude applying the definition of the shape
analysis for tcall.

$be = \mathsf{yield}.b$. The instruction yield does not change the shape and we can apply the
inductive hypothesis on $b$.

$be = \mathsf{next}.g(\mathbf{e})$. The instruction next does not change the shape and we can apply the
inductive hypothesis on $g(\mathbf{e})$.

$be = \varrho := e.b$. By Lemma 41 we have the shape $E \cdot e$. By definition of the shape analysis
on write, we get back to the shape $E$ and then we apply the inductive hypothesis on $b$.

$be = \mathsf{match} \ldots$. The same argument as for expression bodies applies.

$be = \mathsf{read}\ \varrho\ \mathsf{with}\ c_1(\mathbf{y}_1) \Rightarrow b_1 \mid \ldots \mid c_n(\mathbf{y}_n) \Rightarrow b_n \mid [\_] \Rightarrow g(\mathbf{e})$. We recall that the compiled
code is:

$j_0 : (\mathsf{read}\ i(\varrho, \eta)) \cdot (\mathsf{branch}\ c_1\ j_1) \cdot C(b_1, \eta \cdot \mathbf{y}_1) \cdots$
$j_{n-1} : (\mathsf{branch}\ c_n\ j_n) \cdot C(b_n, \eta \cdot \mathbf{y}_n) \cdot j_n : (\mathsf{wait}\ j_0) \cdot C(g(\mathbf{e}), \eta)$

The read instruction produces a shape $E \cdot y$. Then if a positive branch is selected, we
have a shape $E \cdot \mathbf{y}_k$, for $k \in 1..n$. We note that the triples $(b_k, \eta \cdot \mathbf{y}_k, E \cdot \mathbf{y}_k)$ are strongly
compatible and therefore the inductive hypothesis applies to $C(b_k, \eta \cdot \mathbf{y}_k)$ for $k \in 1..n$. On
the other hand, if the last default branch $[\_]$ is selected then by definition of the shape
analysis on wait we get back to the shape $E$ and again the inductive hypothesis applies
to $C(g(\mathbf{e}), \eta)$. The case where a pattern can be a variable is similar.

To conclude the proof we notice that for every function definition $f(\mathbf{x}) = be$, taking
$\eta = \mathbf{x} = E$ we have that $(be, \eta, E)$ is strongly compatible and thus by Lemma 42 the
shape analysis succeeds on $C(be, \eta)$ starting from $E$. □